A Logistic Regression Model to Identify Factors Influencing Cassava Productivity in the Southern Part of Sierra Leone
*1Regina Baby Sesay, 2Ahmed Koneh
1,2Department of Mathematics and Statistics, School of Technology, Njala University, Njala, Sierra Leone
The role of cassava as food and cash crop in Sierra Leone has contributed immensely to the
country's economic development. This includes providing employment facilities for Sierra
Leoneans. Cassava is the second largest food crop grown across the country. Despite its
importance and tremendous contributions to the country's economic development, its production
faces several constraints. This work, therefore, focused on using a statistical modeling technique
to key out the major factors influencing cassava productivity in the southern part of Sierra Leone.
It further measured the effect of each factor on cassava productivity. A multiple binary logistic
regression modeling technique was used in the empirical analysis. Two hundred cassava
farmers were randomly selected from the communities in the study area. Cassava productivity
was measured by the level of cassava yield. Initially, several factors were considered as possible
determinants of the level of cassava yield. However, the empirical analysis showed that farm size,
educational level, and age by farming experience are the main factors influencing cassava
productivity in the study area. An increase in farm size can increase cassava yield, while an increase
in educational level may decrease cassava productivity. Older people with more farming
experience can contribute significantly to cassava productivity.
Key words: Predictors, Sensitivity, Farmers, Yield, Sierra Leone
INTRODUCTION
Cassava with a botanical name Manihot esculenta Crantz
originated from South America. It is extensively
propagated as an annual crop in the tropical and
subtropical regions for its edible starchy tuber (FAO,
2003a). Cassava is a perennial shrub grown throughout
lowland tropical regions.
The role of cassava as food and cash crop has contributed
immensely to the economic development of Sierra Leone.
Cassava is the second largest food crop grown across the
country, with an annual yield of 350,000 tons in 2006
(Sanni et al., 2009). The main areas of production are the
South-West, central and far north of the Country. It is one
of the most important food crops in Sierra Leone as it
serves as a major source of carbohydrate (FAO, 2004).
According to FAO estimates, 172 million tons of cassava
were produced worldwide in 2000. Africa accounted for
54% (FAO, 2003b). Also, in 2002, world production of
cassava tuber was estimated to be 184 million tons, the
majority of production was in Africa, where 99.1 million
tons were grown (FAO, 2003b).
In Sierra Leone, the significance of cassava cannot be
overemphasized as it stands out to be the main
supplement to rice, which is a well-known staple food for
Sierra Leoneans. Nearly 90% of cassava produced is for
human consumption; less than 10% are semi-processed
for on-farm animal feed (Sanni et al. 2009). This is clearly
seen in the provinces during the rainy season, as the
demand for food shifts from the staple food, rice to cassava
due to an increase in the price of rice.
Moreover, annual population growth is about 2.8% in most
West African countries, while urban growth is generally
significantly higher than rural growth. An annual urban
growth rate of 5% for a 10-year period implies a 63%
increase in the urban population and the demand for food
(Essers et al. 2005). To feed the urban dwellers, food
supply from every farm household has to increase by at
least 63% in 10 years (Sanni et al. 2009). This clearly
points out the necessity for an increase in the growth of a
supplementary food crop like cassava.
*Corresponding Author: Regina Baby Sesay,
Department of Mathematics and Statistics, School of
Technology, Njala University, Njala, Sierra Leone.
E-mail: regisesay@yahoo.com; Tel: +23279235912.
Co-Author Email: ahmedkonneh@gmail.com, Tel:
+23279782822
Research Article
Vol. 5(2), pp. 592-604, September, 2019. © www.premierpublishers.org, ISSN: 2167-0477
Journal of Agricultural Economics and Rural Development
Despite its importance and tremendous contributions,
cassava production in Sierra Leone faces several
constraints. Some of these constraints are: inadequate
funding; lack of farming experience; lack of availability of
land for farming and the educational level of cassava
farmers.
This study, therefore, aims to key out the major factors
influencing cassava productivity in the Moyamba District,
Southern Province of Sierra Leone. It used a logistic
regression modeling technique to identify the key
determinants of cassava productivity and to measure the
effect of each determinant on the yield of cassava grown
in the study area.
MATERIALS AND METHODS
Theoretical Frameworks
This section focuses on the review of the theoretical and
conceptual frameworks of using a logistic regression
method for analyzing a categorical outcome. It also points
out the main statistics used in checking a logistic regression
model.
Logistic Regression
Regression analysis is a predictive modeling technique. It
investigates and estimates the relationship between a
variable of interest called the dependent or target variable
and one or more variables that may have an influence on
the dependent variable called predictor(s). Based on the
type of dependent variable(s), the number of independent
variables and shape of the regression line, there exist
different regression techniques used to investigate
relevant relationships and to make valuable predictions.
Among these numerous regression techniques, this work
used a multiple binary logistic regression modeling
technique to investigate the factors influencing cassava
production in the Moyamba District. Cassava productivity
was measured by the level of cassava yield. The model is
‘multiple’ because there was more than one independent
variable, ‘binary’ because the variable of interest, the
dependent variable, was dichotomous (high or low yield),
and ‘logistic’ because the relationship between the
dependent variable and the independent variable(s) is
not linear.
In building the logistic regression model to achieve the
purpose of this research work, the following concepts and
statistics were considered:
The Binomial Distribution
The binomial distribution is appropriate to use as an error
distribution in logistic regression because:
1. the outcome of interest is dichotomous (a success or a
failure); and
2. a number of independent trials are considered.
Let:
\[
y_i =
\begin{cases}
1 & \text{if the } i\text{th farm's cassava yield is high} \\
0 & \text{if the } i\text{th farm's cassava yield is low}
\end{cases}
\]
Equation (1)
where $y_i$ is the level of the yield for farm $i$. Here, $y_i$ is
considered as a realization of a random variable $Y_i$ that
can take the values one and zero with probabilities $p_i$ and
$1 - p_i$ respectively. The distribution of $Y_i$ is called a
Bernoulli distribution with parameter $p_i$ and can be written
as
\[
\Pr(Y_i = y_i) = p_i^{y_i} (1 - p_i)^{1 - y_i}
\]
Equation (2)
for $y_i = 0, 1$. If $y_i = 1$, $p_i$ is obtained, and if $y_i = 0$,
$1 - p_i$ is obtained.
Logistic Regression Model
From the above discussion of the binomial distribution, the
logistic regression model can be understood as a means
of finding the $\beta$ parameters that best fit:
\[
y_i =
\begin{cases}
1 & \text{if } \beta_0 + \beta_1 x + \varepsilon > 0 \\
0 & \text{otherwise}
\end{cases}
\]
Equation (3)
where $\varepsilon$ is an error term.
In short, if $\hat{p}$ is the predicted probability that $Y = 1$, given
the values of $x_1, \ldots, x_k$, the model assumes that
\[
\log \frac{\hat{p}}{1 - \hat{p}} = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k
\]
Equation (4)
where $Y \sim \text{Binomial}(1, \hat{p})$.
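As an illustration of what "finding the β parameters that best fit" involves, the sketch below estimates a one-predictor version of Equation (4) by maximum likelihood using Newton's method. It is a minimal sketch, not the SPSS procedure used in the study; the function name `fit_logistic` and the ten data points are hypothetical and serve only to make the example runnable.

```python
import math

def fit_logistic(x, y, iterations=25):
    """Estimate (b0, b1) in log(p / (1 - p)) = b0 + b1 * x by Newton's method."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # score (gradient) for the intercept
            g1 += (yi - p) * xi     # score (gradient) for the slope
            h00 += w                # entries of the 2x2 information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix)^-1 times the score vector.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: farm size in acres and a high (1) / low (0) yield flag.
farm_size = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
high_yield = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

b0, b1 = fit_logistic(farm_size, high_yield)
fitted = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in farm_size]
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```

At the maximum-likelihood estimates the fitted probabilities sum to the number of observed high-yield farms, which is a convenient check that the iteration has converged. The sketch assumes the data are not perfectly separable; otherwise the estimates diverge.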
Parameter Interpretation
In the simple linear model, $Y = \beta_0 + \beta_1 x_1$, if $x$ increases
by 1, $Y$ increases by $\beta_1$. In a logistic regression model, it is
$\log \frac{\hat{p}}{1 - \hat{p}}$ which increases by $\beta_1$. To
see this, let the predicted probability of the event of interest
be $\hat{p}_0$ when $x = 0$ and $\hat{p}_1$ when $x = 1$. Then
\[
\log \frac{\hat{p}_0}{1 - \hat{p}_0} = \beta_0
\]
\[
\log \frac{\hat{p}_1}{1 - \hat{p}_1} = \beta_0 + \beta_1
\]
so that
\[
\log \frac{\hat{p}_1}{1 - \hat{p}_1} = \log \frac{\hat{p}_0}{1 - \hat{p}_0} + \beta_1
\]
Taking the exponential of both sides of this equation, we have:
\[
e^{\log \frac{\hat{p}_1}{1 - \hat{p}_1}} = e^{\log \frac{\hat{p}_0}{1 - \hat{p}_0} + \beta_1}
\]
This gives
\[
\frac{\hat{p}_1}{1 - \hat{p}_1} = \frac{\hat{p}_0}{1 - \hat{p}_0} \times e^{\beta_1}
\]
Equation (5)
This means that, when $x$ increases by 1, the odds of a positive
outcome are multiplied by a factor of $e^{\beta_1}$. Therefore,
$e^{\beta_1}$ is called the odds ratio for a unit increase in $x$.
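To be specific, the odds ratio for a continuous independent
variable, $OR_c$, can be defined as:
\[
OR_c = \frac{\text{odds}(x+1)}{\text{odds}(x)}
= \frac{F(x+1) / \left(1 - F(x+1)\right)}{F(x) / \left(1 - F(x)\right)}
= \frac{e^{\beta_0 + \beta_1 (x+1)}}{e^{\beta_0 + \beta_1 x}} = e^{\beta_1}
\]
Equation (6)
where $F(x)$ denotes the fitted probability of the event at $x$.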
In the case of a binary independent variable, the odds ratio
can be defined as $\frac{ad}{bc}$, where a, b, c and d are the cells of a 2×2
contingency table.
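The two definitions of the odds ratio above can be checked numerically. The sketch below is illustrative only: the coefficients and cell counts are hypothetical, and the helper names are not taken from the paper. It shows that the ratio of fitted odds at x + 1 and x is the same for every x and equals exp(β1), and it evaluates the cross-product ratio ad/bc for a 2×2 table.

```python
import math

def logistic(z):
    """Logistic function F(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Convert a probability into odds p / (1 - p)."""
    return p / (1.0 - p)

def odds_ratio_continuous(beta0, beta1, x):
    """Odds ratio for a one-unit increase in x, computed from fitted probabilities."""
    return odds(logistic(beta0 + beta1 * (x + 1))) / odds(logistic(beta0 + beta1 * x))

def odds_ratio_2x2(a, b, c, d):
    """Cross-product odds ratio ad / bc for the cells of a 2x2 table."""
    return (a * d) / (b * c)

# Hypothetical coefficients, chosen only for illustration.
beta0, beta1 = -1.0, 0.4
ratios = [odds_ratio_continuous(beta0, beta1, x) for x in (0.0, 2.5, 7.0)]
print(ratios, math.exp(beta1))  # every ratio matches exp(beta1) up to rounding

# Hypothetical 2x2 table of counts (a, b in the first row; c, d in the second).
print(odds_ratio_2x2(20, 10, 5, 15))
```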
Measures of fit for Logistic Regression
As with any classical linear model, a vital part of logistic
regression analysis is assessing how well the model fits the data.
Before trusting the result of a model to make valid
conclusions or predict future outcomes, it is important to
check the model beyond all reasonable doubt to make sure
that the model assumed is correctly specified and the data
at hand do not conflict with assumptions made by the
model.
The residuals or differences between observed and fitted
values were the raw materials used in these tests.
Deviance Goodness-of-Fit Test
The deviance goodness-of-fit test assesses the
discrepancy between the current model and the full (saturated) model.
The deviance statistic, denoted $D^2$, is:
\[
D^2 = 2 \left[ \log L_s(\hat{\beta}) - \log L_m(\hat{\beta}) \right]
\]
Equation (7)
where
$\log L_m(\hat{\beta})$ = maximized log-likelihood of the fitted model
$\log L_s(\hat{\beta})$ = maximized log-likelihood of the saturated
model
Evidence of model lack-of-fit occurs when the value of $D^2$
is large.
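A small numerical sketch of the deviance may help. It assumes ungrouped binary data, for which the saturated model reproduces every observation exactly, so its maximized log-likelihood is zero and the deviance reduces to minus twice the log-likelihood of the fitted model. The outcomes and fitted probabilities below are hypothetical.

```python
import math

def log_likelihood(y, p):
    """Bernoulli log-likelihood of 0/1 outcomes y given fitted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
               for yi, pi in zip(y, p))

def deviance(y, p_fitted):
    """D^2 = 2 * (log Ls - log Lm), with log Ls = 0 for ungrouped 0/1 data."""
    log_l_saturated = 0.0  # assumption: one observation per covariate pattern
    return 2.0 * (log_l_saturated - log_likelihood(y, p_fitted))

# Hypothetical outcomes and fitted probabilities, for illustration only.
y_obs = [1, 0, 1, 1, 0]
p_fit = [0.8, 0.3, 0.6, 0.9, 0.2]
print(round(deviance(y_obs, p_fit), 3))
```

Fitted probabilities that sit closer to the observed outcomes give a smaller deviance, which is the sense in which a large value signals lack of fit.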
Pearson Goodness-of-Fit Test
The Pearson goodness-of-fit test also assesses the
discrepancy between the current model and the full model.
The test statistic is:
\[
\chi^2 = \sum_{j=1}^{n} \frac{(O_j - E_j)^2}{E_j}
= N \sum_{j=1}^{n} \frac{\left( O_j / N - p_j \right)^2}{p_j}
\]
Equation (8)
where
$\chi^2$ = Pearson's cumulative test statistic, which
asymptotically approaches a $\chi^2$ distribution
$O_j$ = the number of observations of type $j$
$N$ = total number of observations
$E_j = N p_j$ = the expected (theoretical) frequency of type $j$,
asserted by the null hypothesis that the fraction of type $j$ in
the population is $p_j$
$n$ = the number of cells in the table.
Hosmer Lemeshow
This goodness-of-fit test was used to determine whether
the predicted probabilities deviate from the observed
probabilities in a way that the binomial distribution does not
predict. If the p-value for the goodness-of-fit test is lower
than the chosen significance level, the predicted
probabilities deviate from the observed probabilities in a
way that the binomial distribution cannot predict.
Hosmer and Lemeshow (2000) recommended partitioning
the observations into 10 equal-sized groups according to
their predicted probabilities, so that
\[
G_{HL}^2 = \sum_{j=1}^{10} \frac{(O_j - E_j)^2}{E_j \left( 1 - E_j / n_j \right)} \sim \chi_8^2
\]
Equation (9)
where
$n_j$ = number of observations in the $j$th group
$O_j = \sum_i y_{ij}$ = observed number of cases in the $j$th group
$E_j$ = expected number of cases in the $j$th group
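The grouping and summation in the Hosmer-Lemeshow statistic can be sketched as follows. This is an illustrative implementation under stated assumptions, not the SPSS routine used in the study: the expected count in each group is taken as the sum of the predicted probabilities in that group, the p-value uses the closed-form upper tail of the chi-square distribution for even degrees of freedom, and the data are simulated from a fixed seed.

```python
import math
import random

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable; closed form for even df."""
    if df % 2 != 0:
        raise ValueError("this closed form needs an even number of degrees of freedom")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def hosmer_lemeshow(y, p, groups=10):
    """Return (statistic, p_value) for groups of equal size ordered by predicted probability."""
    pairs = sorted(zip(p, y))
    n = len(pairs)
    stat = 0.0
    for j in range(groups):
        chunk = pairs[j * n // groups:(j + 1) * n // groups]
        n_j = len(chunk)
        observed = sum(yi for _, yi in chunk)   # O_j
        expected = sum(pi for pi, _ in chunk)   # E_j (assumed: sum of predicted probabilities)
        stat += (observed - expected) ** 2 / (expected * (1.0 - expected / n_j))
    return stat, chi2_sf_even_df(stat, groups - 2)

# Simulated, hypothetical data: 200 predicted probabilities and outcomes drawn from them.
rng = random.Random(0)
p_hat = [(i + 0.5) / 200 for i in range(200)]
y_obs = [1 if rng.random() < p else 0 for p in p_hat]

stat, p_value = hosmer_lemeshow(y_obs, p_hat)
print(f"HL statistic = {stat:.3f}, p-value = {p_value:.3f}")
```

A p-value below the chosen significance level would be read, as described above, as evidence that the predicted probabilities deviate from the observed ones.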
Measures of the Predictive Power of the Logistic
Regression Model
The $R^2$ statistic for logistic regression was used to
measure the predictive power of the model. There are
different versions of $R^2$ in the statistics literature, but this
work used the Cox and Snell and the Nagelkerke $R^2$ statistics
produced by SPSS.
Adding any variable to a model tends to increase $R^2$,
even if that variable is irrelevant. For this reason,
an adjusted $R^2$ is preferred when assessing the predictive
power of the logistic regression model.
Cox and Snell $R^2$:
It is sometimes referred to as a
“pseudo” $R^2$.
The Cox and Snell $R^2$ is
\[
R_{C\&S}^2 = 1 - \left( \frac{L_0}{L_M} \right)^{2/n}
\]
Equation (10)
where $L_0$ is the likelihood of the intercept-only model, $L_M$ is the
likelihood of the fitted model and $n$ is the sample size.
Nagelkerke R Square: The Nagelkerke R Square adjusts
the Cox and Snell R Square so that the range of possible
values extends to 1. This is achieved by dividing the Cox
and Snell R-squared by its maximum possible value,
\[
1 - L(M_{\text{intercept}})^{2/N}
\]
Equation (11)
So, if the full model perfectly predicts the outcome and has
a likelihood of 1, the Nagelkerke R-squared = 1. That is,
when $L(M_{\text{full}}) = 1$, $R^2 = 1$, and when $L(M_{\text{full}}) = L(M_{\text{intercept}})$,
$R^2 = 0$. The Nagelkerke R Square is:
\[
R_N^2 = \frac{1 - \left( L(M_{\text{intercept}}) / L(M_{\text{full}}) \right)^{2/N}}{1 - L(M_{\text{intercept}})^{2/N}}
\]
Equation (12)
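The Cox and Snell statistic and its Nagelkerke rescaling can be computed directly from the two maximized log-likelihoods. The sketch below works on the log scale to avoid underflow when the likelihoods are very small; the log-likelihood values are hypothetical and are not taken from the study's SPSS output.

```python
import math

def cox_snell_r2(loglik_null, loglik_model, n):
    """1 - (L0 / LM)^(2/n), computed from log-likelihoods."""
    return 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n)

def nagelkerke_r2(loglik_null, loglik_model, n):
    """Cox and Snell R^2 divided by its maximum possible value, 1 - L0^(2/n)."""
    max_r2 = 1.0 - math.exp(2.0 * loglik_null / n)
    return cox_snell_r2(loglik_null, loglik_model, n) / max_r2

# Hypothetical log-likelihoods for a sample of 200 farmers.
ll_null, ll_model, n = -138.6, -100.0, 200
print(round(cox_snell_r2(ll_null, ll_model, n), 3))
print(round(nagelkerke_r2(ll_null, ll_model, n), 3))
```

The two boundary cases stated in the text can be confirmed with the same functions: a fitted log-likelihood of 0 (likelihood 1) gives a Nagelkerke value of 1, and a fitted log-likelihood equal to that of the intercept-only model gives 0.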
Methodology
This section introduces stages involved in the data
analysis. It also points out the type of data analysis
adopted at each stage together with the need for each
analysis.
Description of study area
This study was carried out in the Moyamba District, in the
southern part of Sierra Leone, which had a population of 318,064
in the 2015 population and housing census (Statistics
Sierra Leone, 2015). Like other parts of the country,
Moyamba District has seasonal variation. It has a rainy
season that starts in May and ends in October and a dry
season that starts in November and ends in April. One of
the main occupations of people living in this part of the
country is farming.
Sampling Technique and Data Collection
In line with the work of Peduzzi et al. (1996) (for sample
size consideration in logistic regression), a random
sampling technique was employed to select two hundred
(200) cassava farmers from the communities in the study
area. Questionnaires containing questions relating to the
level of cassava output together with potential factors that
might influence the level of the output were administered
to all the selected cassava farmers. The data obtained
provided information on the socioeconomic characteristics
of the cassava farmers, output or yield of cassava and
other factors such as farming experience, farm size,
sources of labour, source of farm power, control means of
pest and disease, credit facilities, and extension contacts.
Measurement of Variables
The study used data on technical coefficient (input-output)
of cassava production. The input factors include labour,
control means of pest and disease, credit facility,
extension services, land and socioeconomic factors. The
socioeconomic factors/variables were made up of the
farmer's gender, age, level of education, marital status,
religious background and family size. The land variable
was the total land area in acres cultivated by the
farmer, which indicates the size of the farm. Labour
calculations were based on the total number of people
employed to work on a given farm land in a particular crop
season. The educational level of the farmers was
determined by the number of years spent in school. Family
size was determined by the number of people living in the
household during the crop year. The output factor was the
level of cassava yield, which is the total cassava yield in
bags per acre per crop season. For example, if the
expected yield per acre is 10 bags of cassava during the
crop year, then a yield below 5 bags was considered
low, while a yield of 5 bags and above was considered
high.
Descriptive and Exploratory Data Analysis
The first stage of the analysis was used to gain an
understanding of the distributions of both continuous and
categorical variables.
For the continuous variables, a bivariate exploratory data
analysis was carried out to know if there was a relationship
between each continuous independent variable and the
categorical outcome variable.
The independent-samples t-test was used as an exploratory
tool. As in any exploratory analysis, it helped to determine
whether it was worth fitting a logistic regression model for
the continuous variables. A significant difference in means
between the high-yield and low-yield groups indicated that
the variable was related to the outcome and was therefore
worth carrying forward into the logistic regression model.
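For reference, the equal-variances form of the independent-samples t statistic can be sketched as below. The function name and the two small samples are hypothetical; the study itself relied on SPSS output.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(group_a, group_b):
    """Return (t, df) for the equal-variances independent-samples t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_error = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(group_a) - mean(group_b)) / std_error, df

# Hypothetical farm sizes (acres) for high-yield and low-yield farmers.
high_yield_farms = [6, 8, 7, 9, 5, 10]
low_yield_farms = [3, 4, 6, 2, 5, 4]
t, df = pooled_t_statistic(high_yield_farms, low_yield_farms)
print(f"t = {t:.3f} on {df} degrees of freedom")
```

The statistic is the mean difference divided by its standard error, which is how the "Mean Difference" and "Std. Error Difference" columns of an SPSS independent-samples table relate to the reported t value.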
Variable Selection
The following steps were taken to select variables to enter
the Logistic Regression Model.
Step 1. Univariable Analyses
Univariate logistic regression was used to test the
association of each explanatory variable (one at a time)
with the outcome variable. This step helped to eliminate
insignificant variables from the model (i.e. variables that do
not show any significant association with the dependent
variable by themselves), as such variables are not likely
to be associated with the outcome variable even after all
the other variables are added to the model.
The results of this univariate analysis include: Wald and
likelihood ratio chi-square test statistics and their p-values;
parameter estimates and standard errors; and odds ratios
and their confidence limits. Each of these results was
considered.
Furthermore, since the parameters of a logistic
regression are estimated on the log-odds scale, odds ratios
were examined. The odds ratios were obtained by
exponentiating the parameter estimates. An odds ratio
greater than one (>1) indicates a positive association, less
than one (<1) indicates a negative association,
and equal to one (=1) indicates no association of the tested
variable with the outcome.
Step 2. Multivariable Analyses
The next step was to carry out the multiple logistic
regression analysis on the selected independent variables.
At the end of the multiple logistic regression analysis,
those variables found to be insignificant were not included
in the final model.
Test for Parameters
After the multiple logistic regression analysis, the
importance of each explanatory variable was assessed by
carrying out statistical tests of the significance of the
coefficients. Parameter estimates and standard errors of
the variables in the model were assessed after addition or
deletion of a variable. This was done using the Wald and
likelihood ratio test statistics and their associated p-values.
The Wald statistic
The Wald $\chi^2$ statistic was used to test the significance of
individual coefficients in the model. The statistic is
calculated as follows:
\[
W = \left( \frac{\text{coefficient}}{SE(\text{coefficient})} \right)^2
\]
Equation (13)
Each Wald statistic was compared to a χ2 distribution with
1 degree of freedom. Wald statistics are easy to calculate,
but their reliability is questionable, particularly for small
samples. For data that produce large estimates of the
coefficient, the standard error is often inflated, resulting in
a lower Wald statistic, and therefore the explanatory
variable may be incorrectly assumed to be unimportant in
the model. Likelihood ratio tests (see below) are generally
considered to be superior.
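The Wald calculation is short enough to sketch in full. The p-value below uses the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so its upper tail can be written with the complementary error function. The coefficient and standard error are hypothetical.

```python
import math

def wald_test(coefficient, std_error):
    """Return the Wald chi-square statistic and its p-value on 1 degree of freedom."""
    statistic = (coefficient / std_error) ** 2
    # For 1 df: P(chi2 > w) = P(|Z| > sqrt(w)) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical estimate and standard error, for illustration only.
w, p = wald_test(0.85, 0.30)
print(f"Wald = {w:.3f}, p = {p:.4f}")
```

The same function also shows the weakness noted above: if a large coefficient comes with an inflated standard error, the ratio shrinks and the p-value rises, even though the variable may matter.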
Likelihood ratio test: The likelihood ratio test for a
particular parameter compares the likelihood ($L_0$) of
obtaining the data when the parameter is zero with the
likelihood ($L_1$) of obtaining the data evaluated at the MLE
of the parameter. The test statistic is calculated as follows:
\[
-2 \ln(\text{likelihood ratio}) = -2 \ln(L_0 / L_1) = -2 \left( \ln L_0 - \ln L_1 \right)
\]
Equation (14)
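A minimal sketch of the likelihood ratio calculation follows. It assumes that a single parameter is being tested, so the statistic is referred to a chi-square distribution with 1 degree of freedom; the two log-likelihood values are hypothetical.

```python
import math

def likelihood_ratio_test(loglik_restricted, loglik_full):
    """Return -2 * (ln L0 - ln L1) and its p-value on 1 degree of freedom."""
    statistic = -2.0 * (loglik_restricted - loglik_full)
    # Guard against tiny negative values caused by rounding in the inputs.
    statistic = max(statistic, 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical log-likelihoods: parameter fixed at zero vs. parameter at its MLE.
lr, p = likelihood_ratio_test(-102.0, -100.0)
print(f"LR = {lr:.3f}, p = {p:.4f}")
```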
Measures of Fit for Logistic Regression Model
As already mentioned under the theoretical framework
section, before trusting the result of a model to make valid
conclusions or predict future outcomes, the model should
be checked beyond all reasonable doubt to make sure that
the model assumed is correctly specified and that the
data at hand does not conflict with assumptions made by
the model. In this work, the Hosmer and Lemeshow
Goodness of fit test was used to check whether the logistic
model assumed was correctly specified.
Model Discrimination
How well the model distinguished between the two groups
of the binary outcome was
assessed using the area under the receiver operating
characteristic (ROC) curve. This curve is obtained by
plotting sensitivity against 1 − specificity. The diagonal line
represents chance. A curve that lies far above the diagonal
line shows that an indicator is accurate. For a useful model
the area varies between 0.5 and 1. An area of 0.5 corresponds to the
diagonal, attained when no discrimination exists. An area
closer to 1 represents a good indicator, whereas an area
of 1 represents a perfect indicator.
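The area under the ROC curve has a simple interpretation that can be computed without drawing the curve: it is the proportion of (high-yield, low-yield) pairs in which the high-yield farm receives the larger predicted probability, with ties counted as one half. The sketch below uses that pairwise definition; the outcomes and scores are hypothetical.

```python
def roc_auc(y, scores):
    """Area under the ROC curve as the share of correctly ranked (1, 0) pairs."""
    positives = [s for yi, s in zip(y, scores) if yi == 1]
    negatives = [s for yi, s in zip(y, scores) if yi == 0]
    wins = 0.0
    for s_pos in positives:
        for s_neg in negatives:
            if s_pos > s_neg:
                wins += 1.0
            elif s_pos == s_neg:
                wins += 0.5   # a tie counts as half a correct ranking
    return wins / (len(positives) * len(negatives))

# Hypothetical outcomes (1 = high yield) and predicted probabilities.
y_obs = [1, 1, 0, 0]
p_hat = [0.9, 0.3, 0.4, 0.1]
print(roc_auc(y_obs, p_hat))  # 0.75: three of the four pairs are ranked correctly
```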
Measures of the Predictive Power of the Model
The R2 statistic for logistic Regression was used to
measure the predictive power of the model.
Test for Model Assumptions
In the case of binary logistic regression, the fact that the
probability lies between 0 and 1 imposes a constraint.
Therefore, both the assumptions of constant variance and
normality present in multiple linear regressions are lost.
However, as with every statistical method, there are certain
assumptions that need to be met if the result of the
multiple binary logistic regression model is to be useful.
The model was checked to make sure that the data did
not violate those assumptions.
Multicollinearity
Multicollinearity occurs when the model includes multiple
independent variables that are correlated with each other.
This normally occurs when there are some independent
variables that are redundant. It is a type of disturbance that
may be present in the data. If this disturbance is not
eliminated from the data, any statistical inferences made
about the data may not be reliable. There are a number of
ways of detecting multicollinearity in a data set. Among
these are two collinearity diagnostic factors that can help
to identify multicollinearity. These are the value of the
tolerance and its reciprocal, called the variance inflation factor
(VIF). If the value of the tolerance is less than 0.2 or 0.1
and, simultaneously, the value of VIF is 10 or above, then
multicollinearity is problematic.
The tolerance of a variable is $1 - R^2$, where $R^2$ is obtained by
regressing that variable on the other independent variables.
Generally, a small
tolerance value indicates that the variable under
consideration is almost a perfect linear combination of the
independent variables already in the equation and that it
should not be added to the model.
Also, unusually large standard errors of the regression
coefficients can be a sign that multicollinearity is an issue.
In addition to the standard errors of the regression
coefficients, this work used the tolerance statistics and the
variance inflation factor (VIF) to test for multicollinearity.
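The relationship between tolerance and VIF can be illustrated with two predictors, where the R² from regressing one on the other is simply their squared Pearson correlation. The sketch below is an assumption-laden illustration: the ages and years of experience are hypothetical, and with more than two predictors R² would come from a multiple regression instead.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def tolerance_and_vif(r_squared):
    """Tolerance = 1 - R^2 and VIF = 1 / tolerance."""
    tolerance = 1.0 - r_squared
    return tolerance, 1.0 / tolerance

# Hypothetical ages and years of farming experience for seven farmers.
age = [25, 31, 38, 42, 47, 53, 59]
experience = [3, 8, 12, 15, 21, 24, 29]

r2 = pearson_r(age, experience) ** 2
tolerance, vif = tolerance_and_vif(r2)
print(f"R^2 = {r2:.3f}, tolerance = {tolerance:.3f}, VIF = {vif:.1f}")
```

Because these hypothetical values rise together almost linearly, the tolerance is close to zero and the VIF is far above 10, which is the pattern the diagnostic is meant to flag.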
Interaction
To test for interaction, the logistic regression analysis was
carried out with an interaction term. The p-value of the
interaction term in the regression output determined whether
or not to include it in the model: a significant p-value led to
the retention of the interaction term in the present model.
Influential Observation and Outliers
The final step was to find out if there were observations
that do not fit the model well (outliers), have strange values
for any variable (leverage) or that have undue influence on
the model (influence).
This ended the variable selection for the final model.
Final Model
After the variable selection stage, the next step was to fit
and assess the final logistic regression model. Most of the
diagnostic steps taken during the variable selection stage
were again applied to the final model. This was done to
ensure the appropriateness, adequacy and usefulness of
the final model upon which our conclusion was based.
EMPIRICAL ANALYSIS
Descriptive Statistic/ Exploratory data Analysis
Table 1. Descriptive Analysis: Dependent (DV) and Independent (IV) Variables to be Modeled

| Variable Name | IV/DV | Valid Range | Variable Type |
|---|---|---|---|
| Cassava Yield/Outcome | DV | High, Low | Character, Categorical |
| Educational Level | IV | No Formal Education, Primary School, Secondary School, Tech-Voc. | Character, Categorical |
| Gender | IV | Male, Female | Numeric, Categorical |
| Land Owner | IV | Self, Communal, Lease, Rent | Character, Categorical |
| Family Size | IV | 1-17 | Numeric, Categorical |
| Farm Size | IV | 1-10 acres | Numeric, Continuous |
| Age | IV | 17-59 years | Numeric, Continuous |
| Farming Experience | IV | 1-29 years | Numeric, Continuous |
| Source of Labour | IV | Family, hired, communal | Numeric, Categorical |
| Pesticides | IV | Yes, No | Numeric, Categorical |
| Credit Facility | IV | Yes, No | Numeric, Categorical |
| Extension Services | IV | Yes, No | Numeric, Categorical |
Descriptive Statistic For Categorical Variable
Table 2: Descriptive Statistics

| Variable | N | Range | Minimum | Maximum |
|---|---|---|---|---|
| EDUCATIONAL LEVEL | 200 | 3 | 0 | 3 |
| FAMILY SIZE | 200 | 16 | 1 | 17 |
| OWNERSHIP OF THE FARM LAND | 200 | 4 | 0 | 4 |
| SOURCES OF LABOUR | 200 | 2 | 0 | 2 |
| CREDIT FACILITIES | 200 | 1 | 0 | 1 |
| SOURCE OF FARM POWER | 200 | 2 | 0 | 2 |
| PESTS AND DISEASES CONTROL | 200 | 0 | 0 | 0 |
| Valid N (listwise) | 200 | | | |
Descriptive Statistic for Continuous Variable
Table 3: Descriptive Statistics

| Variable | N | Minimum | Maximum | Mean | Std. Deviation |
|---|---|---|---|---|---|
| FARM SIZE OF THE RESPOND | 200 | 0 | 10 | 5.29 | 3.086 |
| AGE OF RESPONDENT | 200 | 17 | 59 | 41.55 | 8.689 |
| FARMING EXPERIENCE OF RESPONDENT | 200 | 1 | 29 | 14.51 | 8.460 |
| Valid N (listwise) | 200 | | | | |
Exploratory data Analysis
A bivariate exploratory analysis was carried out to know if
there was a relationship between the continuous
independent variables and the categorical outcome
variable. The independent-samples t-test was used to
explore the relationship between each of the continuous
independent variables and the outcome variable, cassava
yield.
As with any statistical test, the assumptions of the
independent-samples t-test were considered before it was
used. These assumptions concern sampling, research
design, measurement, population distributions and
population variance. The t-test for independent means is
generally considered robust to violations of the normality
assumption when the sample size is large. This
work used the Q-Q plot to check whether the assumption of
normality was satisfied before using the t-test for
independent means.
Quantile-Quantile (Q-Q) plot for continuous
independent variables
The Q-Q plots for the continuous variables are presented
in figure 1. The Q-Q plot is a graphical method for
comparing two probability distributions by plotting their
quantiles against each other. A concave departure from
the straight line in the Q-Q plot is an indication of a heavy
tailed distribution, whereas a convex departure is an
indication of a thin tail.
From the Q-Q plots in Figure 1, it is evident that the
continuous independent variables are
not perfectly normally distributed. However, because the
sample size is well above 30 (so the central limit theorem
applies to the sample means) and the data were obtained
by random sampling, the t-test was carried out.
Figure 1: QQ-plot of continuous variables
Independent Samples Test
The independent sample t-test was carried out for each
of the continuous independent variables, to determine if:
(1) there is a statistically significant difference in the mean
experience gained by cassava farmers with high
cassava yield and those with low cassava yield.
(2) there is a statistically significant difference in the mean
farm size used by cassava farmers with high cassava
yield and those with low cassava yield.
(3) there is a statistically significant difference in the mean
age of cassava farmers with high cassava yield and
those with low cassava yield.
The independent-samples t-test acted as an exploratory
tool. As in any exploratory analysis, the t-tests helped to
determine whether it was worth fitting a
logistic regression model for these variables. A
significant difference in means suggests that the variable is
related to cassava yield and is therefore a candidate for the
logistic regression model. Below are the outputs of the
independent-samples t-tests for the continuous variables
used in the final model.
The significance level of the independent-samples t-test
(Table 4) for farm size in relation to cassava yield is far
below the threshold significance level of 0.05. This means
that the difference in mean farm size between cassava
farmers with high yield and those with low
yield is statistically significant. This further implies
that there is a relationship between farm size and cassava
yield. The logistic regression model was used to explore
this relationship further.
Similarly, the significance level of the independent-samples
t-test (Table 6) for farming experience
is far below the threshold significance level of 0.05. This
means that the difference in mean farming experience
between cassava farmers with high yield and those with
low yield is statistically significant. This further implies that
there is a relationship between farming experience and
cassava yield. The logistic regression model was used to
explore this relationship further.
However, the significance level of the independent-samples
t-test for the continuous variable age (Table 5, p = 0.057) is above the
threshold significance level of 0.05. This means that the
difference in the mean age of cassava farmers with
high yield and those with low yield is not statistically
significant.
Table 4: Independent Samples Test (FARM SIZE OF THE RESPOND)

| | Levene's Test for Equality of Variances: F | Levene's Test: Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI of the Difference: Lower | 95% CI of the Difference: Upper |
|---|---|---|---|---|---|---|---|---|---|
| Equal variances assumed | 2.130 | .146 | 4.738 | 198 | .000 | 1.979 | .418 | 1.155 | 2.803 |
| Equal variances not assumed | | | 4.745 | 188.042 | .000 | 1.979 | .417 | 1.156 | 2.802 |
Table 5: Independent Samples Test (AGE OF RESPONDENT)

| | Levene's Test for Equality of Variances: F | Levene's Test: Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI of the Difference: Lower | 95% CI of the Difference: Upper |
|---|---|---|---|---|---|---|---|---|---|
| Equal variances assumed | .070 | .792 | 1.914 | 198 | .057 | 2.353 | 1.230 | -.072 | 4.778 |
| Equal variances not assumed | | | 1.917 | 188.006 | .057 | 2.353 | 1.228 | -.069 | 4.775 |
Table 6: Independent Samples Test (FARMING EXPERIENCE OF RESPONDENT)

| | Levene's Test for Equality of Variances: F | Levene's Test: Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI of the Difference: Lower | 95% CI of the Difference: Upper |
|---|---|---|---|---|---|---|---|---|---|
| Equal variances assumed | .456 | .500 | 7.544 | 198 | .000 | 8.033 | 1.065 | 5.933 | 10.133 |
| Equal variances not assumed | | | 7.612 | 192.494 | .000 | 8.033 | 1.055 | 5.952 | 10.115 |
Variable Selection
This involves two stages of analysis, the univariate stage
and the multivariable stage.
Univariate Analysis
This is the first stage of the variable selection procedure.
Each of the variables was investigated separately using
univariate logistic regression. Table 7 gives a combined
summary of all the univariate outputs.
From table 7, all the independent variables with p-values
less than the threshold value of 0.05 were found to be
significant and hence associated with the dependent
variable. At the second stage of the variable selection
procedure, all the significant independent variables were
further simultaneously investigated using the multivariable
logistic regression.
Table 7: P-Values and Odds Ratios of Independent Variables from Univariate Analysis

| Factor | P-value (Wald test) | P-value (LR test) | Odds Ratio (OR) |
|---|---|---|---|
| Age | 0.010 | 0.009 | 1.673 |
| Educational Level | 0.00 | 0.00 | 2.256 |
| Family Size | 0.331 | 0.307 | 0.410 |
| Farm Experience | 0.001 | 0.000 | 0.031 |
| Land Owner | 0.474 | 0.474 | 0.923 |
| Farm Size | 0.00 | 0.000 | 0.174 |
| Source of Labour | 0.00 | 0.00 | 0.137 |
| Pesticides | 0.00 | 0.00 | 0.171 |
| Credit Facility | 0.001 | 0.00 | 0.329 |
| Extension Services | 0.193 | 0.191 | 0.680 |
| Gender | 0.078 | 0.078 | 1.680 |
Multivariate Analysis
The multivariate output and the goodness-of-fit
test result for the multivariate analysis are presented in
Tables 8 and 9 respectively. From Table 8, the Wald
p-values for some of the independent variables
are greater than the chosen significance threshold of
0.05. The statistically significant independent variables,
based on the p-values, are: farming experience, educational
level, credit facility, source of labour and control means of
pest and disease. This implies that some of the variables
that entered the model during the multivariable analysis
stage were found to be insignificant. The Hosmer-Lemeshow
goodness-of-fit test (Table 9) shows that,
at this multivariate analysis stage, the model is not a good
fit to the data, as p = 0.004 < 0.05.
In addition, following further statistical investigation of
each of the statistically significant independent variables
in Table 8, some of them did not enter the final model.
These further tests showed that some of the variables
influenced the outcome variable in such a way that their
inclusion in the model violated the assumption of ‘no
outlier’. For example, when the highly significant variable
credit facility entered the model as an independent
variable, the maximum Cook’s distance exceeded one (1).
It even attained the value of two (2), which is a clear
violation of the assumption of ‘no outlier’ or influential
observation required for the validity of the logistic
regression results.
Some of the findings of the statistical investigations on
the independent variables are in line with reality. For
example, very few farmers have access to credit facilities.
The few that have access may tend to have large farm
lands, more laborers, and improved planting materials,
leading to very high cassava yield/output. On the other
hand, some cassava farmers may divert the money
received as credit to purposes other than the cassava
production for which it was obtained (the odds ratio for
credit facility is less than one, meaning that a higher credit
grant decreases the odds of a high cassava yield). So it
was not surprising that, when credit facility entered the
equation, the incidence of influential or outlying
observations was alarming. Nevertheless, we still
acknowledge that credit facility is a very strong
determinant of a high or low level of cassava yield (the
outcome variable) in the study area.
Table 8: Variables in the Equation
B S.E. Wald df Sig. Exp(B)
Step 1a FARMING_EXPERIENCE .104 .032 10.690 1 .001 1.109
AGE -.009 .026 .120 1 .729 .991
EDUCATIONAL 14.967 3 .002
EDUCATIONAL(1) -1.874 1.189 2.486 1 .115 .153
EDUCATIONAL(2) .304 1.329 .052 1 .819 1.355
EDUCATIONAL(3) -.181 1.250 .021 1 .885 .834
FARM_SIZE .127 .069 3.371 1 .066 1.136
SOURCES_OF_LABOUR 6.532 2 .038
SOURCES_OF_LABOUR(1) -1.370 .538 6.474 1 .011 .254
SOURCES_OF_LABOUR(2) -.490 .604 .658 1 .417 .612
CREDIT(1) -3.085 .661 21.803 1 .000 .046
CONTROL_MEAN(1) -1.816 .572 10.083 1 .001 .163
Constant 4.142 1.911 4.697 1 .030 62.952
a. Variable(s) entered on step 1: FARMING_EXPERIENCE, AGE, EDUCATIONAL, FARM_SIZE,
SOURCES_OF_LABOUR, CREDIT, CONTROL_MEAN.
Table 9: Hosmer and Lemeshow Test
Step Chi-square df Sig.
1 22.670 8 .004
With the independent variables significantly related to the
outcome variable selected, the next step was to fit the
final logistic regression model.
Final Model
This is the last stage of the analysis. After the variables
have been selected from the first two stages of the logistic
regression modeling, the following analytical procedures
were taken to build and confirm the final model so as to
achieve our objective of identifying the main factors that
influence cassava productivity and to determine the effect
of each factor on cassava produced in the study area.
The categorical variable coding presented in Table 10
shows that the majority of cassava farmers (119 of 200)
had no formal education.
Table 10: Categorical Variables Codings
EDUCATIONAL LEVEL OF RESPONDENT  Frequency  Parameter coding (1)  (2)  (3)
NO FORMAL EDUCATION  119  1.000  .000  .000
PRIMARY SCHOOL  16  .000  1.000  .000
SECONDARY SCHOOL  55  .000  .000  1.000
TECH - VOC.  10  .000  .000  .000
The model coefficients are contained in the column
headed B in Table 11. A negative coefficient means that
the corresponding variable lowers the odds of a high
cassava yield.
The output in Table 11 helped to identify the key
determinants of an increase or decrease in cassava
productivity, that is, those independent variables that
contributed significantly to the level of cassava yield. It
also helped to determine how each determinant influenced
cassava yield.
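For reference, the general form of the binary logistic regression model behind the B and Exp(B) columns can be written as below. The symbols $p$, $\beta_j$, $x_j$ and $k$ are notation introduced here and do not appear in the SPSS output.

```latex
\ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \sum_{j=1}^{k} \beta_j x_j,
\qquad
\mathrm{Exp}(B_j) = e^{\beta_j}
```

Here $p$ is the probability of the outcome coded 1 (assumed to be a high yield), so $e^{\beta_j} < 1$ exactly when $\beta_j < 0$: a one-unit increase in $x_j$ then multiplies the odds $p/(1-p)$ by a factor below one.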
From Table 11, it is clear that among the independent
variables that entered the final model, farm size, with a
Wald significance level (.004) far below the threshold of
0.05, is the main factor that influenced the level of cassava
yield. The odds ratio (Exp(B)) associated with farm size is
1.188, which is greater than one (>1), meaning that an
increase in farm size increases the odds of a high cassava
yield. In other words, each additional unit (acre) of farm
size multiplies the odds of a high cassava yield by about
1.188, with the other variables held constant. Also, from
Table 11, educational level is seen as a significant factor
in determining the level of cassava yield. Its odds ratio
(Exp(B)) is less than one (<1) for each of the three coded
education categories in Table 11. This was interpreted to
mean that the odds of a high cassava yield are lower for
farmers with a high educational level than for those with
no or low educational level. Lastly, from Table 11, the
interaction term, farming experience by age, is a highly
significant factor (with a significance level of .000) in
determining the level of cassava yield. Its odds ratio
(1.003) is greater than one. This implies that the odds of
a high cassava yield are higher for older people with more
farming experience than for younger people with less
farming experience. That is, the odds of a high cassava
yield rise with each unit increase in age by farming
experience (both measured in years).
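To make the odds and probability language concrete, the sketch below plugs the B coefficients reported in Table 11 into the logistic function for one hypothetical farmer. The farmer's profile, the function names, and the assumption that the outcome coded 1 is a high yield are illustrative choices rather than details given in the paper; the coefficients are the rounded values printed in the table.

```python
import math

# B coefficients as printed in Table 11 (rounded to three decimals).
COEF = {
    "constant": -0.809,
    "no_formal_education": -2.099,   # EDUCATIONAL(1)
    "primary_school": -0.971,        # EDUCATIONAL(2)
    "secondary_school": -0.085,      # EDUCATIONAL(3)
    "farm_size": 0.172,              # per acre
    "age_x_experience": 0.003,       # per unit of age * years of experience
}


def predicted_probability(education, farm_size, age, experience):
    """Return the model-implied probability of the outcome coded 1.

    `education` is one of the dummy names above, or "tech_voc" for the
    reference category (all dummies equal to zero, as in Table 10).
    """
    logit = COEF["constant"]
    if education != "tech_voc":
        logit += COEF[education]
    logit += COEF["farm_size"] * farm_size
    logit += COEF["age_x_experience"] * age * experience
    return 1.0 / (1.0 + math.exp(-logit))


def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


# Hypothetical farmer: no formal education, 5 acres, aged 40, 15 years of experience.
p_before = predicted_probability("no_formal_education", 5, 40, 15)
# The same farmer with one more acre of land.
p_after = predicted_probability("no_formal_education", 6, 40, 15)

odds_multiplier = odds(p_after) / odds(p_before)
print(round(p_before, 3), round(p_after, 3), round(odds_multiplier, 3))
```

The ratio of the two odds equals exp(0.172) regardless of the profile chosen, whereas the change in probability depends on where the farmer starts. This is why an odds ratio is read as a multiplier on the odds, not as a change in probability.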
Table 11: Variables in the Equation
B S.E. Wald df Sig. Exp(B)
95% C.I.for EXP(B)
Lower Upper
Step 1a EDUCATIONAL 21.921 3 .000
EDUCATIONAL(1) -2.099 1.217 2.975 1 .085 .123 .011 1.331
EDUCATIONAL(2) -.971 1.322 .539 1 .463 .379 .028 5.051
EDUCATIONAL(3) -.085 1.263 .005 1 .946 .918 .077 10.908
FARM_SIZE .172 .060 8.357 1 .004 1.188 1.057 1.335
AGE by FARMING_EXPERIENCE .003 .001 26.598 1 .000 1.003 1.002 1.004
Constant -.809 1.275 .402 1 .526 .445
a. Variable(s) entered on step 1: EDUCATIONAL, FARM_SIZE, AGE * FARMING_EXPERIENCE.
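The Wald, Exp(B) and confidence-interval columns of Table 11 follow from B and S.E. by standard formulas. The sketch below recomputes them for the FARM_SIZE row; the function name `odds_ratio_with_ci` and the use of the normal critical value 1.96 are assumptions made for illustration.

```python
import math


def odds_ratio_with_ci(b, se, z=1.96):
    """Return (Wald statistic, odds ratio, CI lower, CI upper) for one coefficient."""
    wald = (b / se) ** 2                 # Wald chi-square with 1 df
    odds_ratio = math.exp(b)             # Exp(B)
    lower = math.exp(b - z * se)         # lower 95% limit on the odds-ratio scale
    upper = math.exp(b + z * se)         # upper 95% limit on the odds-ratio scale
    return wald, odds_ratio, lower, upper


# FARM_SIZE row of Table 11: B = .172, S.E. = .060.
wald, odds_ratio, lower, upper = odds_ratio_with_ci(0.172, 0.060)
print(f"Wald={wald:.3f} OR={odds_ratio:.3f} 95% CI=({lower:.3f}, {upper:.3f})")
```

Because the inputs are rounded to three decimals, the recomputed Wald statistic and interval limits agree with Table 11 only approximately.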
Model Checking
Chi-square goodness of fit test for model coefficients
The test in Table 12 was used to check whether the
present (new) model, with explanatory variables included,
is an improvement over the baseline model. It uses a
chi-square test to see if there is a significant difference
between the -2 log-likelihoods (-2LLs) of the baseline
model and the present model. A significantly reduced -2LL
suggests that the new model explains more of the variation
in the outcome variable than the baseline model; in other
words, the new model is an improvement over the baseline
model. From Table 12, the chi-square statistic is highly
significant (chi-square = 87.395, df = 5, p < .001). This
shows that the present (new) model is significantly better
than the baseline model.
Table 12: Omnibus Tests of Model Coefficients
Chi-square df Sig.
Step 1 Step 87.395 5 .000
Block 87.395 5 .000
Model 87.395 5 .000
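The model chi-square is the drop in -2 log-likelihood from the baseline model to the fitted model. As a rough check, the sketch below rebuilds the baseline -2LL from the observed LOW/HIGH counts in Table 13 and subtracts the fitted -2LL from Table 15. It assumes the baseline is an intercept-only model, and the variable names are chosen here for illustration.

```python
import math

# Observed outcome counts from Table 13 (row totals for LOW and HIGH).
n_low, n_high = 66 + 22, 24 + 88
n = n_low + n_high

# -2 log-likelihood of the intercept-only (baseline) model,
# which predicts the observed share of HIGH for every farmer.
p_high = n_high / n
neg2ll_null = -2 * (n_high * math.log(p_high) + n_low * math.log(1 - p_high))

# -2 log-likelihood of the fitted model, as reported in Table 15.
neg2ll_model = 186.977

chi_square = neg2ll_null - neg2ll_model
print(round(neg2ll_null, 3), round(chi_square, 3))
```

Under these assumptions the difference comes out at about 87.39, in line with the chi-square reported in Table 12.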
From the classification table presented in Table 13, the
present logistic regression model correctly classified the
outcome for 77% of the cases.
Outcome Classification
Table 13: Classification Table (a)
Observed OUTPUT OR YIELD  Predicted LOW  Predicted HIGH  Percentage Correct
Step 1  LOW  66  22  75.0
Step 1  HIGH  24  88  78.6
Step 1  Overall Percentage  77.0
a. The cut value is .500
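The percentages in the classification table can be recomputed directly from the four cell counts. The sketch below treats HIGH yield as the positive class, which is an assumption made here so that the row percentages can be labelled sensitivity and specificity.

```python
# Counts from Table 13: rows are observed classes, columns are predicted classes.
low_as_low, low_as_high = 66, 22
high_as_low, high_as_high = 24, 88

total = low_as_low + low_as_high + high_as_low + high_as_high

# Share of observed HIGH farmers predicted as HIGH.
sensitivity = high_as_high / (high_as_high + high_as_low)
# Share of observed LOW farmers predicted as LOW.
specificity = low_as_low / (low_as_low + low_as_high)
# Share of all farmers classified correctly.
accuracy = (low_as_low + high_as_high) / total

print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f} accuracy={accuracy:.3f}")
```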
Model chi-square goodness of fit test
The hypotheses tested for the model goodness of fit were
stated as:
𝐻0: The model is a good fitting model.
𝐻𝑎: The model is not a good fitting model.
From Table 14, the Hosmer-Lemeshow goodness-of-fit test
shows that the model is a good fit to the data, as
𝑝 = 0.724 > .05.
Table 14: Hosmer and Lemeshow Test
Step Chi-square df Sig.
1 5.309 8 .724
Measures of the Predictive Power of the Model
From the model summary presented in Table 15, between
35% (Cox & Snell R Square = .354) and 47% (Nagelkerke R
Square = .474) of the variation in cassava yield was
explained by the logistic regression model.
Table 15: Model Summary
Step  -2 Log likelihood  Cox & Snell R Square  Nagelkerke R Square
1  186.977a  .354  .474
a. Estimation terminated at iteration number 5 because
parameter estimates changed by less than .001.
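Both pseudo R-square values in Table 15 can be recovered from the -2 log-likelihood and the model chi-square using the standard Cox & Snell and Nagelkerke formulas. The sketch below is an illustration of those formulas with variable names chosen here; it takes the sample size of 200 and the chi-square of 87.395 from the paper's other tables.

```python
import math

n = 200                    # sample size
neg2ll_model = 186.977     # -2 log-likelihood of the fitted model (Table 15)
chi_square = 87.395        # model chi-square (Table 12)
neg2ll_null = neg2ll_model + chi_square   # implied -2LL of the intercept-only model

# Cox & Snell R-square: 1 - (L0 / L1)^(2/n), written in terms of -2LL values.
cox_snell = 1 - math.exp(-chi_square / n)

# Nagelkerke rescales Cox & Snell by its maximum attainable value.
max_cox_snell = 1 - math.exp(-neg2ll_null / n)
nagelkerke = cox_snell / max_cox_snell

print(round(cox_snell, 3), round(nagelkerke, 3))
```

Rounded to three decimals, these reproduce the .354 and .474 shown in Table 15.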
Influential Observation and Outliers
It is also good practice to find out whether there are
observations that do not fit the model well (outliers), have
unusual predictor values (leverage), or have undue
influence on the model (influential observations). In this
work, Cook’s distance, denoted Di, was used to identify
influential observations, that is, points that unduly affect
the fitted logistic regression model. The measure combines
each observation’s leverage and residual: the higher the
leverage and residual, the higher the Cook’s distance. A Di
value of more than 1 indicates that an influential
observation is present.
The maximum and minimum values of Cook’s distance
for our analysis are presented in the summary table (Table
16) below.
From Table 16, the maximum value of Di is 0.40012, which
is less than one (<1). Therefore, influential observations or
outliers are not a concern for the final model.
Table 16: Analog of Cook's influence statistics
N Valid 200
Missing 0
Mode .00028
Range .40010
Minimum .00003
Maximum .40012
MODEL DISCRIMINATION
How well the model distinguishes between the two groups
of the binary outcome was assessed using the area under
the receiver operating characteristic (ROC) curve.
The two basic measures of diagnostic accuracy are
sensitivity and specificity (Zhou et al., 2002). Plotting
sensitivity against 1 - specificity gives the ROC curve. The
diagonal line in the plot represents chance. The curve in
Figure 2 is well above the diagonal line. In addition, from
Table 17, the area under the curve (AUC) is 0.814. This
represents a high predictive accuracy of the chosen
model. In other words, an AUC value of 0.814 (which is
close to 1) indicates that the model reliably distinguished
between cassava farmers with high and low cassava
yields.
Figure 2: Receiver Operating Characteristic (ROC) curve
Table 17: Area Under the Curve
Test Result Variable(s):
Area  .814
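The area under the ROC curve has a direct interpretation: it is the probability that a randomly chosen high-yield farmer receives a higher predicted probability than a randomly chosen low-yield farmer. The sketch below computes it that way on a small made-up data set; the function name `auc`, the labels and the scores are all hypothetical and are not the study data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count one half).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for sp in positives:
        for sn in negatives:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical example: 1 = high yield, 0 = low yield,
# scores are model-predicted probabilities of a high yield.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
print(auc(labels, scores))
```

A value of 0.5 corresponds to the chance diagonal and 1.0 to perfect separation of the two groups.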
Multicollinearity
As already mentioned in the methodology section, among
the several ways of detecting multicollinearity in a data
set, this work used the tolerance and its reciprocal, the
variance inflation factor (VIF). A variable’s tolerance is
$1 - R^2$. If the tolerance is less than 0.2 or 0.1 and,
simultaneously, the VIF is 10 or above, then
multicollinearity is problematic.
From our analysis, the highest value of $R^2$, which is the
Nagelkerke $R^2$, is equal to 0.474. Hence the tolerance is
$1 - R^2 = 1 - 0.474 = 0.526$ and the VIF, being the
reciprocal of the tolerance, is
$\mathrm{VIF} = 1/0.526 \approx 1.901$. The tolerance is far
above 0.1 and the VIF is far below 10. It is therefore
concluded that multicollinearity is not problematic. In
addition, the standard errors of the coefficients are not
excessively large, which further suggests that
multicollinearity is not an issue here.
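The tolerance and VIF arithmetic can be checked with a short helper. Under the standard definition the VIF is the reciprocal of the tolerance, 1/(1 - R^2); the function name `tolerance_and_vif` is introduced here for illustration, and the sketch simply reuses the R^2 value quoted in the text.

```python
def tolerance_and_vif(r_squared):
    """Return (tolerance, VIF) where tolerance = 1 - R^2 and VIF = 1 / tolerance."""
    if not 0 <= r_squared < 1:
        raise ValueError("R^2 must lie in [0, 1)")
    tolerance = 1 - r_squared
    return tolerance, 1 / tolerance


# R^2 value used in the text (0.474).
tol, vif = tolerance_and_vif(0.474)
print(round(tol, 3), round(vif, 3))

# Rule of thumb quoted in the text: tolerance below 0.1 or VIF of 10 and above is a concern.
is_problematic = tol < 0.1 or vif >= 10
print(is_problematic)
```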
RESULTS AND DISCUSSION
A logistic regression analysis was carried out to find out
the main factors influencing cassava productivity in the
Moyamba District, southern province of Sierra Leone. The
level of cassava productivity was measured by the level
(high or low) of cassava yield. At the initial stage of the
analysis, many factors were considered as potential
determinants of cassava productivity in the study area.
However, further statistical investigation proved that some
of those factors were not significant determinants of a high
or low yield of cassava. Insignificant factors were dropped
out of the analysis. Variables (factors) that entered the
final model are: farm size, educational level and the
interaction term, age by farming experience.
At the final stage of the analysis, the logistic regression
model was significant, as the test of the full model against
a model with only the constant was significant (chi-square
= 87.395, df = 5, p < .05). This shows that the predictors as
a set reliably distinguished between a high and a low yield
of cassava. The model explained between 35% and 47%
(Cox and Snell R2 and Nagelkerke R2 respectively) of the
variation in the cassava yield.
The Wald criterion showed that, among the independent
variables that entered the final model, farm size, with a
significance level (for Wald) far below the threshold of
0.05, was the main factor that influenced the level of
cassava yield. The odds ratio (Exp(B)) associated with
farm size is 1.188, which is greater than one (>1), meaning
that an increase in farm size increases the odds of a high
cassava yield: each additional unit (acre) of farm size
multiplies those odds by about 1.188. This result is in
conformity with the result documented by Ren et al. (2019)
that farm size plays a critical role in agricultural
sustainability.
Educational level was also shown to be a significant factor
in determining the level (high or low) of cassava yield.
However, in line with the view of the majority of Sierra
Leoneans that subsistence farming is an option for those
who failed to go to school or dropped out of school, its
odds ratio (Exp(B)) is 0.123, which is less than one (<1).
This was interpreted to mean that the odds of a high
cassava yield are lower at higher educational levels. This
result is similar to that obtained by Malte Reimers and
Stephan Klasen (2013), who detected insignificant or even
surprisingly negative effects of schooling on agricultural
productivity.
Finally, the interaction term, farming experience by age, is
a highly significant factor (with a significance level of .000)
in determining the level of cassava yield. Its odds ratio is
greater than one. This implies that the odds of a high
cassava yield are higher for older people with more
farming experience than for younger people with less
farming experience. In other words, the odds of a high
cassava yield rise with each unit increase in age by
experience. This is not surprising, as extension services
for disseminating information on farm technologies are not
common in the rural areas. Farmers only gain experience
after long years of farming. A study conducted by Gideon
Danso-Abbeam et al. (2018) reaffirmed the critical role of
extension programmes in enhancing farm productivity and
household income.
Credit facility, though it did not enter the final model (as it
exhibited extreme behavior), was still recognized as a
significant determinant of the level of cassava yield. This
is because the Wald p-value associated with credit facility
was significant at both the univariate and multivariable
stages of the variable selection in the logistic regression
modeling. This result is supported by Ekwere et al. (2014),
who, in their article “Effects of agricultural credit facility on
the agricultural production and rural development”,
documented that the independent variables loan size,
farm size, and inputs explained the variation in the total
value of farmers’ output.
CONCLUSION
The purpose of this work was to identify the main factors
influencing cassava productivity and to determine the
effect of each factor on cassava yield/output. The
empirical evidence showed that, farm size, educational
level, and age by farming experience are the main factors
influencing cassava productivity in the study area. An
increase in farm size can increase cassava yield, while an
increase in educational level can decrease cassava yield.
In fact, most of the cassava farmers had no formal
education. Older people with more farming experience
contributed significantly to cassava production in the
study area.
REFERENCES
Ekwere et al 2014, Effects of agricultural credit facility on
the agricultural production and rural development,
International Journal of the Environment Vol.3 (2) 192-
204
Essers MA, de Vries-Smits LM, Barker N et al. (2005),
Functional interaction between beta-catenin and FOXO
in oxidative stress signalling. Science 308: 1181-1184
FAO, (2003a). The state of food insecurity in the World:
monitoring progress towards the food summit and
millennium development goals: Rome, Italy: pp. 24-26.
FAO, (2003b). Cassava production data (2002).
(http://www.fao.org).
FAO (2004). Proposals for a definition and methods of
analysis for dietary fibre content. CX/NFSDU 04/3 Add
1. Codex Committee on Nutrition and Foods for Special
Dietary Uses. Codex Alimentarius Commission.
Gideon Danso-Abbeam, Dennis Sedem Ehiakpor and
Robert Aidoo (2018), Agricultural extension and its
effects on farm productivity and income: insight from
Northern Ghana, Agriculture & Food Security
https://doi.org/10.1186/s40066-018-0225-
Hosmer, D. W., & Lemeshow, S. (2000). Applied logistic
regression. New York: Wiley.
Malte Reimers and Stephan Klasen (2013), Revisiting the
Role of Education for Agricultural Productivity,
American Journal of Agricultural Economics, vol. 95,
issue 1, 131-152
Peduzzi P., Concato J., Kemper E, Holford T R, Feinstein
A.R. (1996). Simulation study of the number of events
per variable in logistic regression analysis. Journal of
Clinical Epidemiology.
Ren, Chenchen, Liu, Shen et al (2019), The impact of farm
size on agricultural sustainability, Journal of Cleaner
Production vol. 22
Sanni, L.O., Onadipe, P. Ilona, M.D. Mussagy, A. Abass,
and A.G.O. Dixon, (2009). Successes and challenges of
cassava enterprises in West Africa: a case study of
Nigeria, Bénin, and Sierra Leone. IITA, Ibadan, Nigeria.
19 pp
Statistics Sierra Leone, (2015). Population and Housing
Census.
Zhou, X. H., Obuchowski, N. A., and McClish, D. K.
(2002). Statistical methods in diagnostic medicine.
Wiley & Sons: New York.
Accepted 29 August 2019
Citation: Sesay RB, Koneh A (2019). A Logistic
Regression Model to Identify Factors Influencing Cassava
Productivity in the Southern Part of Sierra Leone. Journal
of Agricultural Economics and Rural Development, 5(2):
592-604.
Copyright: © 2019: Sesay and Koneh. This is an open-
access article distributed under the terms of the Creative
Commons Attribution License, which permits unrestricted
use, distribution, and reproduction in any medium,
provided the original author and source are cited.

Improving the Efficiency of Ratio Estimators by Calibration Weightings
 
Urban Liveability in the Context of Sustainable Development: A Perspective fr...
Urban Liveability in the Context of Sustainable Development: A Perspective fr...Urban Liveability in the Context of Sustainable Development: A Perspective fr...
Urban Liveability in the Context of Sustainable Development: A Perspective fr...
 
Transcript Level of Genes Involved in “Rebaudioside A” Biosynthesis Pathway u...
Transcript Level of Genes Involved in “Rebaudioside A” Biosynthesis Pathway u...Transcript Level of Genes Involved in “Rebaudioside A” Biosynthesis Pathway u...
Transcript Level of Genes Involved in “Rebaudioside A” Biosynthesis Pathway u...
 
Multivariate Analysis of Tea (Camellia sinensis (L.) O. Kuntze) Clones on Mor...
Multivariate Analysis of Tea (Camellia sinensis (L.) O. Kuntze) Clones on Mor...Multivariate Analysis of Tea (Camellia sinensis (L.) O. Kuntze) Clones on Mor...
Multivariate Analysis of Tea (Camellia sinensis (L.) O. Kuntze) Clones on Mor...
 
Causes, Consequences and Remedies of Juvenile Delinquency in the Context of S...
Causes, Consequences and Remedies of Juvenile Delinquency in the Context of S...Causes, Consequences and Remedies of Juvenile Delinquency in the Context of S...
Causes, Consequences and Remedies of Juvenile Delinquency in the Context of S...
 
The Knowledge of and Attitude to and Beliefs about Causes and Treatments of M...
The Knowledge of and Attitude to and Beliefs about Causes and Treatments of M...The Knowledge of and Attitude to and Beliefs about Causes and Treatments of M...
The Knowledge of and Attitude to and Beliefs about Causes and Treatments of M...
 
Effect of Phosphorus and Zinc on the Growth, Nodulation and Yield of Soybean ...
Effect of Phosphorus and Zinc on the Growth, Nodulation and Yield of Soybean ...Effect of Phosphorus and Zinc on the Growth, Nodulation and Yield of Soybean ...
Effect of Phosphorus and Zinc on the Growth, Nodulation and Yield of Soybean ...
 
Influence of Harvest Stage on Yield and Yield Components of Orange Fleshed Sw...
Influence of Harvest Stage on Yield and Yield Components of Orange Fleshed Sw...Influence of Harvest Stage on Yield and Yield Components of Orange Fleshed Sw...
Influence of Harvest Stage on Yield and Yield Components of Orange Fleshed Sw...
 
Performance evaluation of upland rice (Oryza sativa L.) and variability study...
Performance evaluation of upland rice (Oryza sativa L.) and variability study...Performance evaluation of upland rice (Oryza sativa L.) and variability study...
Performance evaluation of upland rice (Oryza sativa L.) and variability study...
 
Response of Hot Pepper (Capsicum Annuum L.) to Deficit Irrigation in Bennatse...
Response of Hot Pepper (Capsicum Annuum L.) to Deficit Irrigation in Bennatse...Response of Hot Pepper (Capsicum Annuum L.) to Deficit Irrigation in Bennatse...
Response of Hot Pepper (Capsicum Annuum L.) to Deficit Irrigation in Bennatse...
 
Harnessing the Power of Agricultural Waste: A Study of Sabo Market, Ikorodu, ...
Harnessing the Power of Agricultural Waste: A Study of Sabo Market, Ikorodu, ...Harnessing the Power of Agricultural Waste: A Study of Sabo Market, Ikorodu, ...
Harnessing the Power of Agricultural Waste: A Study of Sabo Market, Ikorodu, ...
 
Influence of Conferences and Job Rotation on Job Productivity of Library Staf...
Influence of Conferences and Job Rotation on Job Productivity of Library Staf...Influence of Conferences and Job Rotation on Job Productivity of Library Staf...
Influence of Conferences and Job Rotation on Job Productivity of Library Staf...
 
Scanning Electron Microscopic Structure and Composition of Urinary Calculi of...
Scanning Electron Microscopic Structure and Composition of Urinary Calculi of...Scanning Electron Microscopic Structure and Composition of Urinary Calculi of...
Scanning Electron Microscopic Structure and Composition of Urinary Calculi of...
 
Gentrification and its Effects on Minority Communities – A Comparative Case S...
Gentrification and its Effects on Minority Communities – A Comparative Case S...Gentrification and its Effects on Minority Communities – A Comparative Case S...
Gentrification and its Effects on Minority Communities – A Comparative Case S...
 
Oil and Fatty Acid Composition Analysis of Ethiopian Mustard (Brasicacarinata...
Oil and Fatty Acid Composition Analysis of Ethiopian Mustard (Brasicacarinata...Oil and Fatty Acid Composition Analysis of Ethiopian Mustard (Brasicacarinata...
Oil and Fatty Acid Composition Analysis of Ethiopian Mustard (Brasicacarinata...
 

Recently uploaded

Personalisation of Education by AI and Big Data - Lourdes Guàrdia
Personalisation of Education by AI and Big Data - Lourdes GuàrdiaPersonalisation of Education by AI and Big Data - Lourdes Guàrdia
Personalisation of Education by AI and Big Data - Lourdes Guàrdia
EADTU
 
SPLICE Working Group: Reusable Code Examples
SPLICE Working Group:Reusable Code ExamplesSPLICE Working Group:Reusable Code Examples
SPLICE Working Group: Reusable Code Examples
Peter Brusilovsky
 
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
EADTU
 
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
中 央社
 

Recently uploaded (20)

MOOD STABLIZERS DRUGS.pptx
MOOD     STABLIZERS           DRUGS.pptxMOOD     STABLIZERS           DRUGS.pptx
MOOD STABLIZERS DRUGS.pptx
 
Observing-Correct-Grammar-in-Making-Definitions.pptx
Observing-Correct-Grammar-in-Making-Definitions.pptxObserving-Correct-Grammar-in-Making-Definitions.pptx
Observing-Correct-Grammar-in-Making-Definitions.pptx
 
Personalisation of Education by AI and Big Data - Lourdes Guàrdia
Personalisation of Education by AI and Big Data - Lourdes GuàrdiaPersonalisation of Education by AI and Big Data - Lourdes Guàrdia
Personalisation of Education by AI and Big Data - Lourdes Guàrdia
 
AIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.pptAIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.ppt
 
Spring gala 2024 photo slideshow - Celebrating School-Community Partnerships
Spring gala 2024 photo slideshow - Celebrating School-Community PartnershipsSpring gala 2024 photo slideshow - Celebrating School-Community Partnerships
Spring gala 2024 photo slideshow - Celebrating School-Community Partnerships
 
When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...
 
The Story of Village Palampur Class 9 Free Study Material PDF
The Story of Village Palampur Class 9 Free Study Material PDFThe Story of Village Palampur Class 9 Free Study Material PDF
The Story of Village Palampur Class 9 Free Study Material PDF
 
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
 
Mattingly "AI and Prompt Design: LLMs with NER"
Mattingly "AI and Prompt Design: LLMs with NER"Mattingly "AI and Prompt Design: LLMs with NER"
Mattingly "AI and Prompt Design: LLMs with NER"
 
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjjStl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
 
FICTIONAL SALESMAN/SALESMAN SNSW 2024.pdf
FICTIONAL SALESMAN/SALESMAN SNSW 2024.pdfFICTIONAL SALESMAN/SALESMAN SNSW 2024.pdf
FICTIONAL SALESMAN/SALESMAN SNSW 2024.pdf
 
SPLICE Working Group: Reusable Code Examples
SPLICE Working Group:Reusable Code ExamplesSPLICE Working Group:Reusable Code Examples
SPLICE Working Group: Reusable Code Examples
 
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
Transparency, Recognition and the role of eSealing - Ildiko Mazar and Koen No...
 
Andreas Schleicher presents at the launch of What does child empowerment mean...
Andreas Schleicher presents at the launch of What does child empowerment mean...Andreas Schleicher presents at the launch of What does child empowerment mean...
Andreas Schleicher presents at the launch of What does child empowerment mean...
 
Book Review of Run For Your Life Powerpoint
Book Review of Run For Your Life PowerpointBook Review of Run For Your Life Powerpoint
Book Review of Run For Your Life Powerpoint
 
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
 
ESSENTIAL of (CS/IT/IS) class 07 (Networks)
ESSENTIAL of (CS/IT/IS) class 07 (Networks)ESSENTIAL of (CS/IT/IS) class 07 (Networks)
ESSENTIAL of (CS/IT/IS) class 07 (Networks)
 
e-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi Rajagopale-Sealing at EADTU by Kamakshi Rajagopal
e-Sealing at EADTU by Kamakshi Rajagopal
 
Improved Approval Flow in Odoo 17 Studio App
Improved Approval Flow in Odoo 17 Studio AppImproved Approval Flow in Odoo 17 Studio App
Improved Approval Flow in Odoo 17 Studio App
 
How to Send Pro Forma Invoice to Your Customers in Odoo 17
How to Send Pro Forma Invoice to Your Customers in Odoo 17How to Send Pro Forma Invoice to Your Customers in Odoo 17
How to Send Pro Forma Invoice to Your Customers in Odoo 17
 

A Logistic Regression Model to Identify Factors Influencing Cassava Productivity in the Southern Part of Sierra Leone

*1Regina Baby Sesay, 2Ahmed Koneh
1,2Department of Mathematics and Statistics, School of Technology, Njala University, Njala, Sierra Leone

The role of cassava as a food and cash crop in Sierra Leone has contributed immensely to the country's economic development, including by providing employment for Sierra Leoneans. Cassava is the second largest food crop grown across the country. Despite its importance and tremendous contributions to the country's economic development, its production faces several constraints. This work therefore used a statistical modeling technique to identify the major factors influencing cassava productivity in the southern part of Sierra Leone and to measure the effect of each factor on cassava productivity. A multiple binary logistic regression modeling technique was used in the empirical analysis. Two hundred cassava farmers were randomly selected from the communities in the study area. Cassava productivity was measured by the level of cassava yield. Initially, several factors were considered as possible determinants of the level of cassava yield. However, the empirical analysis showed that farm size, educational level, and the interaction of age with farming experience are the main factors influencing cassava productivity in the study area. An increase in farm size can increase cassava yield, while an increase in educational level may decrease cassava productivity. Older people with more farming experience can contribute significantly to cassava productivity.

Key words: Predictors, Sensitivity, Farmers, Yield, Sierra Leone

INTRODUCTION

Cassava, with the botanical name Manihot esculenta Crantz, originated in South America.
It is extensively propagated as an annual crop in tropical and subtropical regions for its edible starchy tuber (FAO, 2003a). Cassava is a perennial shrub grown throughout lowland tropical regions. The role of cassava as a food and cash crop has contributed immensely to the economic development of Sierra Leone. Cassava is the second largest food crop grown across the country, with an annual yield of 350,000 tons in 2006 (Sanni et al., 2009). The main areas of production are the south-west, centre and far north of the country. It is one of the most important food crops in Sierra Leone, as it serves as a major source of carbohydrate (FAO, 2004).

According to FAO estimates, 172 million tons of cassava were produced worldwide in 2000, of which Africa accounted for 54% (FAO, 2003b). In 2002, world production of cassava tuber was estimated at 184 million tons; the majority of production was in Africa, where 99.1 million tons were grown (FAO, 2003b).

In Sierra Leone, the significance of cassava cannot be overemphasized, as it is the main supplement to rice, the staple food of Sierra Leoneans. Nearly 90% of the cassava produced is for human consumption; less than 10% is semi-processed for on-farm animal feed (Sanni et al., 2009). This is clearly seen in the provinces during the rainy season, when the demand for food shifts from rice to cassava because of an increase in the price of rice.

*Corresponding Author: Regina Baby Sesay, Department of Mathematics and Statistics, School of Technology, Njala University, Njala, Sierra Leone. E-mail: regisesay@yahoo.com; Tel: +23279235912. Co-Author Email: ahmedkonneh@gmail.com, Tel: +23279782822
Research Article. Vol. 5(2), pp. 592-604, September, 2019. © www.premierpublishers.org, ISSN: 2167-0477. Journal of Agricultural Economics and Rural Development.

Moreover, annual population growth is about 2.8% in most West African countries, while urban growth is generally significantly higher than rural growth. An annual urban growth rate of 5% for a 10-year period implies a 63%
increase in the urban population and in the demand for food (Essers et al., 2005). To feed the urban dwellers, the food supply from every farm household has to increase by at least 63% in 10 years (Sanni et al., 2009). This clearly points to the need to increase the production of a supplementary food crop like cassava.

Despite its importance and tremendous contributions, cassava production in Sierra Leone faces several constraints. These include inadequate funding, lack of farming experience, limited availability of land for farming, and the educational level of cassava farmers. This study therefore aims to identify the major factors influencing cassava productivity in the Moyamba District, Southern Province of Sierra Leone. It used a logistic regression modeling technique to identify the key determinants of cassava productivity and to measure the effect of each determinant on the yield of cassava grown in the study area.

MATERIALS AND METHODS

Theoretical Frameworks

This section reviews the theoretical and conceptual frameworks for using logistic regression to analyze a categorical outcome. It also points out the main statistics used in checking the logistic regression model.

Logistic Regression

Regression analysis is a predictive modeling technique. It investigates and estimates the relationship between a variable of interest, called the dependent or target variable, and one or more variables, called predictors, that may influence it. Depending on the type of dependent variable, the number of independent variables and the shape of the regression line, different regression techniques are used to investigate relevant relationships and to make valuable predictions.
Among these numerous regression techniques, this work used a multiple binary logistic regression modeling technique to investigate the factors influencing cassava production in the Moyamba District. Cassava productivity was measured by the level of cassava yield. The regression is 'multiple' because there was more than one independent variable, 'binary' because the dependent variable was dichotomous (high or low yield), and 'logistic' because the relationship between the dependent variable and the independent variable(s) is not linear. In building the logistic regression model, the following concepts and statistics were considered.

The Binomial Distribution

The binomial distribution is appropriate as an error distribution in logistic regression because:
1. the outcome of interest is dichotomous (a success or a failure); and
2. a number of independent trials are considered.

Let

$$y_i = \begin{cases} 1 & \text{if the } i\text{th farm's cassava yield is high} \\ 0 & \text{if the } i\text{th farm's cassava yield is low} \end{cases} \qquad \text{Equation (1)}$$

where $y_i$ is the level of the yield for farm $i$. Here, $y_i$ is considered a realization of a random variable $Y_i$ that takes the values one and zero with probabilities $p_i$ and $1 - p_i$, respectively. The distribution of $Y_i$ is called a Bernoulli distribution with parameter $p_i$ and can be written as

$$\Pr(Y_i = y_i) = p_i^{y_i}(1 - p_i)^{1 - y_i} \qquad \text{Equation (2)}$$

for $y_i = 0, 1$. If $y_i = 1$, the expression gives $p_i$, and if $y_i = 0$, it gives $1 - p_i$.
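To make Equation (2) concrete, the following minimal Python sketch evaluates the Bernoulli probability of a single outcome and the log-likelihood of several independent outcomes. The function names and the example probabilities are illustrative assumptions and are not taken from the study data.

```python
import math


def bernoulli_pmf(y, p):
    """Pr(Y = y) = p**y * (1 - p)**(1 - y) for y in {0, 1}, as in Equation (2)."""
    if y not in (0, 1):
        raise ValueError("y must be 0 (low yield) or 1 (high yield)")
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return p ** y * (1.0 - p) ** (1 - y)


def log_likelihood(ys, ps):
    """Sum of log Pr(Y_i = y_i) over independent farms."""
    return sum(math.log(bernoulli_pmf(y, p)) for y, p in zip(ys, ps))


# Hypothetical outcomes and predicted probabilities for three farms.
outcomes = [1, 0, 1]
probabilities = [0.8, 0.3, 0.6]
print(log_likelihood(outcomes, probabilities))
```

Fitting a logistic regression amounts to choosing the coefficients whose implied probabilities make this log-likelihood as large as possible.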
Logistic Regression Model

From the above discussion of the binomial distribution, the logistic regression model can be understood as a means of finding the $\beta$ parameters that best fit

$$y_i = \begin{cases} 1 & \text{if } \beta_0 + \beta_1 x + \varepsilon > 0 \\ 0 & \text{otherwise} \end{cases} \qquad \text{Equation (3)}$$

where $\varepsilon$ is an error term. In short, if $\hat{p}$ is the predicted probability that $Y = 1$ given the values of $x_1, \dots, x_k$, the model assumes that

$$\log\frac{\hat{p}}{1 - \hat{p}} = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k \qquad \text{Equation (4)}$$

where $Y \sim \mathrm{Binomial}(1, \hat{p})$.

Parameter Interpretation

In the simple linear model $Y = \beta_0 + \beta_1 x$, if $x$ increases by 1, $Y$ increases by $\beta_1$. In a logistic regression model, it is the log-odds $\log\frac{\hat{p}}{1 - \hat{p}}$ that increases by $\beta_1$. To see this, let the predicted probability of the event of interest be $\hat{p}_0$ when $x = 0$ and $\hat{p}_1$ when $x = 1$. Then

$$\log\frac{\hat{p}_0}{1 - \hat{p}_0} = \beta_0$$
$$\log\frac{\hat{p}_1}{1 - \hat{p}_1} = \beta_0 + \beta_1$$

$$\log\frac{\hat{p}_1}{1 - \hat{p}_1} = \log\frac{\hat{p}_0}{1 - \hat{p}_0} + \beta_1$$

Exponentiating both sides of this equation gives

$$\frac{\hat{p}_1}{1 - \hat{p}_1} = \frac{\hat{p}_0}{1 - \hat{p}_0} \times e^{\beta_1} \qquad \text{Equation (5)}$$

This means that when $x$ increases by 1, the odds of a positive outcome are multiplied by a factor of $e^{\beta_1}$. Therefore, $e^{\beta_1}$ is called the odds ratio for a unit increase in $x$. Specifically, the odds ratio for a continuous independent variable, $OR_c$, can be defined as

$$OR_c = \frac{\mathrm{odds}(x+1)}{\mathrm{odds}(x)} = \frac{F(x+1)/\bigl(1 - F(x+1)\bigr)}{F(x)/\bigl(1 - F(x)\bigr)} = \frac{e^{\beta_0 + \beta_1 (x+1)}}{e^{\beta_0 + \beta_1 x}} = e^{\beta_1} \qquad \text{Equation (6)}$$

where $F(x)$ is the predicted probability at $x$. For a binary independent variable, the odds ratio can be defined as $\frac{ad}{bc}$, where a, b, c and d are the cell counts in a 2×2 contingency table.

Measures of Fit for Logistic Regression

As with any classical linear model, a vital part of logistic regression analysis is assessing how well the model fits the data. Before trusting the result of a model to draw valid conclusions or predict future outcomes, it is important to check carefully that the assumed model is correctly specified and that the data at hand do not conflict with the assumptions made by the model. The residuals, or differences between observed and fitted values, were the raw materials used in these tests.

Deviance Goodness-of-Fit Test

The deviance goodness-of-fit test assesses the discrepancy between the current model and the full model. The deviance statistic, denoted $D^2$, is

$$D^2 = 2\left[\log L_s(\hat{\beta}) - \log L_m(\hat{\beta})\right] \qquad \text{Equation (7)}$$

where $\log L_m(\hat{\beta})$ is the maximized log-likelihood of the fitted model and $\log L_s(\hat{\beta})$ is the maximized log-likelihood of the saturated model. A large value of $D^2$ is evidence of model lack of fit.

Pearson Goodness-of-Fit Test

The Pearson goodness-of-fit test also assesses the discrepancy between the current model and the full model.
The test statistic is

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} = N \sum_{i=1}^{n} \frac{(O_i/N - p_i)^2}{p_i} \qquad \text{Equation (8)}$$

where
$\chi^2$ = Pearson's cumulative test statistic, which asymptotically approaches a $\chi^2$ distribution;
$O_i$ = the number of observations of type $i$;
$N$ = the total number of observations;
$E_i = N p_i$ = the expected (theoretical) frequency of type $i$, asserted by the null hypothesis that the fraction of type $i$ in the population is $p_i$;
$n$ = the number of cells in the table.

Hosmer-Lemeshow

This goodness-of-fit test was used to determine whether the predicted probabilities deviate from the observed probabilities in a way that the binomial distribution does not predict. If the p-value for the goodness-of-fit test is lower than the chosen significance level, the predicted probabilities deviate from the observed probabilities in a way that the binomial distribution cannot predict. Hosmer and Lemeshow (2000) recommended partitioning the observations into 10 equal-sized groups according to their predicted probabilities, so that

$$G_{HL}^2 = \sum_{j=1}^{10} \frac{(O_j - E_j)^2}{E_j (1 - E_j/n_j)} \sim \chi_8^2 \qquad \text{Equation (9)}$$

where
$n_j$ = the number of observations in the $j$th group;
$O_j = \sum_i y_{ij}$ = the observed number of cases in the $j$th group;
$E_j$ = the expected number of cases in the $j$th group.

Measures of the Predictive Power of the Logistic Regression Model

The R2 statistic for logistic regression was used to measure the predictive power of the model. There are different versions of R2 in the statistics literature, but this work used the Cox and Snell and the Nagelkerke R2 values produced by SPSS. Adding any variable tends to increase the value of R2, even if that variable is irrelevant. For this reason, the adjusted R2 is preferred for assessing the predictive power of the logistic regression model.

Cox and Snell R2: This is sometimes referred to as a "pseudo" R2. The Cox and Snell R2 is
$$R_{C\&S}^2 = 1 - \left(\frac{L_0}{L_M}\right)^{2/n} \qquad \text{Equation (10)}$$

where $n$ is the sample size, $L_0$ is the likelihood of the intercept-only model and $L_M$ is the likelihood of the fitted model.

Nagelkerke R Square: The Nagelkerke R Square adjusts the Cox and Snell R Square so that the range of possible values extends to 1. This is achieved by dividing the Cox and Snell R Square by its maximum possible value,

$$1 - L(M_{intercept})^{2/N} \qquad \text{Equation (11)}$$

where $L(M_{intercept}) = L_0$ and $N = n$ in the notation of Equation (10). So, if the full model perfectly predicts the outcome and has a likelihood of 1, the Nagelkerke R Square equals 1. That is, when $L(M_{full}) = 1$, $R^2 = 1$, and when $L(M_{full}) = L(M_{intercept})$, $R^2 = 0$. The Nagelkerke R Square is:

$$R_N^2 = \frac{1 - \left(\dfrac{L(M_{intercept})}{L(M_{full})}\right)^{2/N}}{1 - L(M_{intercept})^{2/N}} \qquad \text{Equation (12)}$$

Methodology

This section introduces the stages involved in the data analysis. It also points out the type of analysis adopted at each stage and the need for each analysis.

Description of Study Area

This study was carried out in the Moyamba District, in the southern part of Sierra Leone, which had a population of 318,064 in the 2015 population and housing census (Statistics Sierra Leone, 2015). Like other parts of the country, Moyamba District has two seasons: a rainy season that starts in May and ends in October, and a dry season that starts in November and ends in April. One of the main occupations of people living in this part of the country is farming.

Sampling Technique and Data Collection

In line with the work of Peduzzi et al. (1996) on sample size considerations in logistic regression, a random sampling technique was employed to select two hundred (200) cassava farmers from the communities in the study area. Questionnaires containing questions on the level of cassava output and on potential factors that might influence it were administered to all the selected cassava farmers.
The data obtained provided information on the socioeconomic characteristics of the cassava farmers, the output or yield of cassava, and other factors such as farming experience, farm size, sources of labour, source of farm power, means of pest and disease control, credit facilities, and extension contacts.

Measurement of Variables

The study used data on the technical coefficients (input-output) of cassava production. The input factors include labour, means of pest and disease control, credit facilities, extension services, land and socioeconomic factors. The socioeconomic variables were the farmer's gender, age, level of education, marital status, religious background and family size. The land variable was the total land area, in acres, cultivated by the farmer, which indicates the size of the farm. Labour was measured as the total number of people employed to work on a given farm in a particular crop season. The educational level of the farmers was determined by the number of years spent in school. Family size was determined by the number of people living in the household during the crop year. The output factor was the level of cassava yield, based on the total cassava yield in bags per acre per crop season. For example, if the expected yield per acre is 10 bags of cassava during the crop year, then a yield below 5 bags was considered low, while a yield of 5 bags and above was considered high.

Descriptive and Exploratory Data Analysis

The first stage of the analysis was used to gain an understanding of the distributions of both continuous and categorical variables. For the continuous variables, a bivariate exploratory data analysis was carried out to determine whether there was a relationship between each continuous independent variable and the categorical outcome variable. The independent-samples t-test was used as an exploratory tool.
Like any exploratory analysis, the independent-samples t-test helped to determine whether it was worth fitting a logistic regression model for the continuous variables. A significant difference in means between the high-yield and low-yield groups indicated that the variable was worth including in a logistic regression model.

Variable Selection

The following steps were taken to select variables to enter the logistic regression model.

Step 1. Univariable Analyses

Univariable logistic regression was used to test the association of each explanatory variable (one at a time) with the outcome variable. This step helped to eliminate insignificant variables from the model (i.e. variables that do not show any significant association with the dependent variable by themselves), as such variables are not likely to be associated with the outcome variable even after all the other variables are added to the model. The results of this univariable analysis include Wald and likelihood ratio chi-square test statistics and their p-values; parameter estimates and standard errors; and odds ratios and their confidence limits. Each of these results was considered.
Furthermore, since the parameters of a logistic regression are estimated on the log-odds scale, odds ratios were examined. The odds ratios were calculated by exponentiating the parameter estimates. An odds ratio greater than one (>1) indicates a positive association, an odds ratio less than one (<1) indicates a negative association, and an odds ratio equal to one (=1) indicates no association of the tested variable with the outcome.

Step 2. Multivariable Analyses

The next step was to carry out the multiple logistic regression analysis on the selected independent variables. Variables found to be insignificant in this analysis were not included in the final model.

Test for Parameters

After the multiple logistic regression analysis, the importance of each explanatory variable was assessed by testing the significance of its coefficient. Parameter estimates and standard errors of the variables in the model were assessed after the addition or deletion of a variable. This was done using the Wald and likelihood ratio test statistics and their associated p-values.

The Wald statistic: The Wald χ2 statistic was used to test the significance of individual coefficients in the model. The statistic is calculated as follows:

$$\text{Wald } \chi^2 = \left(\frac{\text{coefficient}}{SE(\text{coefficient})}\right)^2 \qquad \text{Equation (13)}$$

Each Wald statistic was compared to a χ2 distribution with 1 degree of freedom. Wald statistics are easy to calculate, but their reliability is questionable, particularly for small samples. For data that produce large estimates of the coefficient, the standard error is often inflated, resulting in a lower Wald statistic, so the explanatory variable may be incorrectly judged unimportant in the model. Likelihood ratio tests (see below) are generally considered to be superior.
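As a rough illustration of the Wald test in Equation (13), the Python sketch below computes the statistic, its p-value from a χ2 distribution with 1 degree of freedom, and the odds ratio obtained by exponentiating the coefficient. The function name and the coefficient and standard error values are hypothetical; they are not estimates from this study, whose analysis was carried out in SPSS.

```python
import math


def wald_test(coefficient, standard_error):
    """Return the Wald chi-square statistic, its 1-df p-value and the odds ratio."""
    if standard_error <= 0:
        raise ValueError("standard_error must be positive")
    wald = (coefficient / standard_error) ** 2
    # With 1 degree of freedom, P(chi2 > w) equals erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(wald / 2.0))
    odds_ratio = math.exp(coefficient)
    return wald, p_value, odds_ratio


# Hypothetical coefficient and standard error, for illustration only.
wald, p_value, odds_ratio = wald_test(0.8, 0.4)
print(wald, p_value, odds_ratio)
```

The closed form for the p-value relies on the fact that a χ2 variable with 1 degree of freedom is the square of a standard normal variable, so no statistics library is needed for this special case.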
Likelihood ratio test: The likelihood ratio test for a particular parameter compares the likelihood ($L_0$) of obtaining the data when the parameter is zero with the likelihood ($L_1$) of obtaining the data evaluated at the MLE of the parameter. The test statistic is calculated as follows:

$$-2 \ln(\text{likelihood ratio}) = -2 \ln(L_0 / L_1) = -2 (\ln L_0 - \ln L_1) \qquad \text{Equation (14)}$$

Measures of Fit for Logistic Regression Model

As already mentioned under the theoretical framework section, before trusting the result of a model to draw valid conclusions or predict future outcomes, the model should be checked carefully to make sure that it is correctly specified and that the data at hand do not conflict with the assumptions made by the model. In this work, the Hosmer and Lemeshow goodness-of-fit test was used to check whether the assumed logistic model was correctly specified.

Model Discrimination

How well the model distinguished between the two groups of the binary outcome was assessed using the area under the receiver operating characteristic (ROC) curve. This curve was obtained by plotting sensitivity against 1 − specificity. The diagonal line represents chance. A curve that is far above the diagonal line shows that an indicator is accurate. The area under the curve varies between 0.5 and 1. An area of 0.5 corresponds to the diagonal, attained when no discrimination exists; an area closer to 1 represents a good indicator, and an area of 1 represents a perfect indicator.

Measures of the Predictive Power of the Model

The R2 statistic for logistic regression was used to measure the predictive power of the model.

Test for Model Assumptions

In binary logistic regression, the fact that the probability lies between 0 and 1 imposes a constraint. Therefore, the assumptions of constant variance and normality present in multiple linear regression are lost.
However, like every statistical method, multiple binary logistic regression has assumptions that must be met if its results are to be useful. The model was checked to make sure that the data did not violate those assumptions.

Multicollinearity
Multicollinearity occurs when the model includes multiple independent variables that are correlated with each other. This normally occurs when some of the independent variables are redundant. It is a type of disturbance that may be present in the data. If this disturbance is not removed, statistical inferences made from the data may not be reliable. There are a number of ways of detecting multicollinearity in a data set. Among these are two collinearity diagnostics: the tolerance and its reciprocal, the variance inflation factor (VIF). If the tolerance is less than 0.2 or 0.1 and, simultaneously, the VIF is 10 or above, then multicollinearity is problematic. A variable's tolerance is 1 − R2. Generally, a small tolerance value indicates that the variable under consideration is almost a perfect linear combination of the independent variables already in the equation and that it should not be added to the model.
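The tolerance and VIF described above can be sketched for the simplest case of two predictors. In the standard definition, the R2 in 1 − R2 comes from regressing one predictor on the remaining predictors; with only two predictors this R2 equals the squared Pearson correlation between them. The data below are hypothetical age and farming-experience values made up for illustration, not the study data, and the function names are assumptions.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def tolerance_and_vif(x, y):
    """Tolerance (1 - R^2) and VIF (1 / tolerance) for a two-predictor model.

    With two predictors, the R^2 from regressing one on the other is r ** 2.
    """
    r_squared = pearson_r(x, y) ** 2
    tolerance = 1.0 - r_squared
    return tolerance, 1.0 / tolerance


# Hypothetical values, for illustration only.
age = [25, 30, 35, 40, 45, 50, 55]
experience = [3, 8, 9, 15, 14, 22, 25]
tol, vif = tolerance_and_vif(age, experience)
print("tolerance:", round(tol, 3), "VIF:", round(vif, 3))
```

With more than two predictors, each predictor would instead be regressed on all the others to obtain its own R2, tolerance and VIF.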
Also, if the standard errors of the regression coefficients are large, multicollinearity may be an issue. In addition to the standard errors of the regression coefficients, this work used the tolerance statistic and the variance inflation factor (VIF) to test for multicollinearity.

Interaction
To test for interaction, the logistic regression analysis was carried out with an interaction term. The p-value of the interaction term in the regression output determined whether or not it was included in the model. A significant p-value led to the retention of the interaction term in the present model.

Influential Observation and Outliers
The final step was to find out whether there were observations that do not fit the model well (outliers), have unusual values for any variable (leverage) or have undue influence on the model (influence). This ended the variable selection for the final model.

Final Model
After the variable selection stage, the next step was to fit and assess the final logistic regression model. Most of the diagnostic steps taken during the variable selection stage were applied again to the final model. This was done to ensure the appropriateness, adequacy and usefulness of the final model upon which our conclusion was based.

EMPIRICAL ANALYSIS

Descriptive Statistics / Exploratory Data Analysis

Table 1: Descriptive Analysis: Dependent (DV) and Independent (IV) Variables to be Modeled

| Variable Name | IV/DV | Valid Range | Variable Type |
|---|---|---|---|
| Cassava Yield/Outcome | DV | High, Low | Character, Categorical |
| Educational Level | IV | No Formal Education, Primary School, Secondary School, Tech-Voc. | Character, Categorical |
| Gender | IV | Male, Female | Numeric, Categorical |
| Land Owner | IV | Self, Communal, Lease, Rent | Character, Categorical |
| Family Size | IV | 1-17 | Numeric, Categorical |
| Farm Size | IV | 1-10 acres | Numeric, Continuous |
| Age | IV | 17-59 years | Numeric, Continuous |
| Farming Experience | IV | 1-29 years | Numeric, Continuous |
| Source of Labour | IV | Family, hired, communal | Numeric, Categorical |
| Pesticides | IV | Yes, No | Numeric, Categorical |
| Credit Facility | IV | Yes, No | Numeric, Categorical |
| Extension Services | IV | Yes, No | Numeric, Categorical |

Descriptive Statistics for Categorical Variables

Table 2: Descriptive Statistics

| Variable | N | Range | Minimum | Maximum |
|---|---|---|---|---|
| EDUCATIONAL LEVEL | 200 | 3 | 0 | 3 |
| FAMILY SIZE | 200 | 16 | 1 | 17 |
| OWNERSHIP OF THE FARM LAND | 200 | 4 | 0 | 4 |
| SOURCES OF LABOUR | 200 | 2 | 0 | 2 |
| CREDIT FACILITIES | 200 | 1 | 0 | 1 |
| SOURCE OF FARM POWER | 200 | 2 | 0 | 2 |
| PESTS AND DISEASES CONTROL | 200 | 0 | 0 | 0 |
| Valid N (listwise) | 200 | | | |

Descriptive Statistics for Continuous Variables

Table 3: Descriptive Statistics

| Variable | N | Minimum | Maximum | Mean | Std. Deviation |
|---|---|---|---|---|---|
| FARM SIZE OF THE RESPOND | 200 | 0 | 10 | 5.29 | 3.086 |
| AGE OF RESPONDENT | 200 | 17 | 59 | 41.55 | 8.689 |
| FARMING EXPERIENCE OF RESPONDENT | 200 | 1 | 29 | 14.51 | 8.460 |
| Valid N (listwise) | 200 | | | | |
Exploratory Data Analysis
A bivariate exploratory analysis was carried out to find out whether there was a relationship between the continuous independent variables and the categorical outcome variable. The independent-samples t-test was used to explore the relationship between each continuous independent variable and the outcome variable, cassava yield. As with any statistical test, the usual assumptions of the t-test were considered before it was used. The assumptions of the t-test for independent means concern sampling, research design, measurement, population distributions and population variances. The t-test for independent means is generally considered robust to violations of the normality assumption when the sample size is large. This work used Q-Q plots to check whether the assumption of normality was satisfied before using the t-test for independent means.

Quantile-Quantile (Q-Q) plot for continuous independent variables
The Q-Q plots for the continuous variables are presented in Figure 1. The Q-Q plot is a graphical method for comparing two probability distributions by plotting their quantiles against each other. A concave departure from the straight line in the Q-Q plot indicates a heavy-tailed distribution, whereas a convex departure indicates a thin tail. From the Q-Q plots in Figure 1, it is evident that the continuous independent variables are not perfectly normally distributed. However, because the sample size is well above 30, so that the Central Limit Theorem applies to the sample means, and because the data were obtained randomly, the t-test was carried out.
Figure 1: Q-Q plots of the continuous variables

Independent Samples Test
The independent-samples t-test was carried out for each of the continuous independent variables, to determine whether:
(1) there is a statistically significant difference in the mean farming experience of cassava farmers with high cassava yield and those with low cassava yield;
(2) there is a statistically significant difference in the mean farm size of cassava farmers with high cassava yield and those with low cassava yield;
(3) there is a statistically significant difference in the mean age of cassava farmers with high cassava yield and those with low cassava yield.

The independent-samples t-test acted as an exploratory tool. Like all exploratory analyses, it helped to determine whether it was worth fitting a logistic regression model for these variables. A significant difference in means suggests that the variable is related to the outcome and is worth carrying forward into the logistic regression. Below are the outputs of the independent-samples t-tests for the continuous variables used in the final model.

The significance level of the independent-samples t-test for farm size in relation to cassava yield (Table 4) is far below the threshold significance level of 0.05. This means that the difference in mean farm size between cassava farmers with high yield and those with low yield is statistically significant. This further implies that there is a relationship between farm size and cassava yield. The logistic regression model was used to explore this relationship further.

Similarly, the significance level of the independent-samples t-test for farming experience (Table 6) is far below the threshold significance level of 0.05. This means that the difference in mean farming experience between cassava farmers with high yield and those with low yield is statistically significant. This further implies that there is a relationship between farming experience and cassava yield. The logistic regression model was used to explore this relationship further.

However, the significance level of the independent-samples t-test for the continuous variable age (Table 5) is above the threshold significance level of 0.05. This means that the difference in mean age between cassava farmers with high yield and those with low yield is not statistically significant.
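The columns of the independent-samples output in Tables 4 to 6 are linked by simple arithmetic: the t statistic is the mean difference divided by its standard error, and the 95% confidence interval is the mean difference plus or minus a critical t value times the standard error. The sketch below reproduces this from the rounded table entries. The critical value 1.972 for 198 degrees of freedom is an approximate table value supplied here as an assumption, and the function name is illustrative.

```python
def t_and_ci(mean_diff, se_diff, t_crit):
    """Return the t statistic and the confidence interval for a mean difference."""
    t_stat = mean_diff / se_diff
    half_width = t_crit * se_diff
    return t_stat, (mean_diff - half_width, mean_diff + half_width)


# Approximate two-sided 5% critical value of t with 198 df (assumed).
T_CRIT_198 = 1.972

# Rounded "equal variances assumed" rows from Tables 4, 5 and 6.
rows = {
    "farm size": (1.979, 0.418),
    "age": (2.353, 1.230),
    "farming experience": (8.033, 1.065),
}

for name, (diff, se) in rows.items():
    t_stat, (lower, upper) = t_and_ci(diff, se, T_CRIT_198)
    print(name, round(t_stat, 3), round(lower, 3), round(upper, 3))
```

For age, the interval includes zero, which is consistent with the non-significant result reported in Table 5.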
Table 4: Independent Samples Test (FARM SIZE OF THE RESPOND)

| | Levene's F | Levene's Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|---|---|
| Equal variances assumed | 2.130 | .146 | 4.738 | 198 | .000 | 1.979 | .418 | 1.155 | 2.803 |
| Equal variances not assumed | | | 4.745 | 188.042 | .000 | 1.979 | .417 | 1.156 | 2.802 |

Table 5: Independent Samples Test (AGE OF RESPONDENT)

| | Levene's F | Levene's Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|---|---|
| Equal variances assumed | .070 | .792 | 1.914 | 198 | .057 | 2.353 | 1.230 | -.072 | 4.778 |
| Equal variances not assumed | | | 1.917 | 188.006 | .057 | 2.353 | 1.228 | -.069 | 4.775 |

Table 6: Independent Samples Test (FARMING EXPERIENCE OF RESPONDENT)

| | Levene's F | Levene's Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|---|---|
| Equal variances assumed | .456 | .500 | 7.544 | 198 | .000 | 8.033 | 1.065 | 5.933 | 10.133 |
| Equal variances not assumed | | | 7.612 | 192.494 | .000 | 8.033 | 1.055 | 5.952 | 10.115 |

In Tables 4 to 6, Levene's F and Sig. refer to Levene's test for equality of variances; the remaining columns refer to the t-test for equality of means.

Variable Selection
This involves two stages of analysis: the univariate stage and the multivariable stage.

Univariate Analysis
This is the first stage of the variable selection procedure. Each of the variables was investigated separately using univariate logistic regression. Table 7 gives a combined summary of all the univariate outputs. From Table 7, all the independent variables with p-values less than the threshold value of 0.05 were found to be significant and hence associated with the dependent variable.
At the second stage of the variable selection procedure, all the significant independent variables were further simultaneously investigated using the multivariable logistic regression.
Table 7: P-Values and Odds Ratios of Independent Variables from Univariate Analysis

| Factor | P-value (Wald test) | P-value (LR test) | Odds Ratio (OR) |
|---|---|---|---|
| Age | 0.010 | 0.009 | 1.673 |
| Educational Level | 0.00 | 0.00 | 2.256 |
| Family Size | 0.331 | 0.307 | 0.410 |
| Farm Experience | 0.001 | 0.000 | 0.031 |
| Land Owner | 0.474 | 0.474 | 0.923 |
| Farm Size | 0.00 | 0.000 | 0.174 |
| Source of Labour | 0.00 | 0.00 | 0.137 |
| Pesticides | 0.00 | 0.00 | 0.171 |
| Credit Facility | 0.001 | 0.00 | 0.329 |
| Extension Services | 0.193 | 0.191 | 0.680 |
| Gender | 0.078 | 0.078 | 1.680 |

Multivariable Analysis
The multivariable output and the goodness-of-fit test result for the multivariable analysis are presented in Tables 8 and 9 respectively. From Table 8, the Wald p-values for some of the independent variables are greater than the chosen significance threshold of 0.05. The statistically significant independent variables, based on the p-values, are farming experience, educational level, credit facility, source of labour and means of pest and disease control. This implies that some of the variables that entered the model at the multivariable stage were found to be insignificant. The Hosmer-Lemeshow goodness-of-fit test (Table 9) shows that, at this stage, the model is not a good fit to the data, as p = 0.004 < 0.05.

In addition, after further statistical investigation of each of the statistically significant independent variables in Table 8, some of them did not enter the final model. The reason is that these investigations showed that some of the variables influenced the outcome variable in such a way that their inclusion in the model violated the assumption of no outliers. For example, when the variable credit facility entered the model as an independent variable with a very small p-value, the maximum Cook's distance exceeded one (1). It even reached two (2), which is a clear violation of the assumption of no outliers or influential observations required for the results of the logistic regression model to be valid.

Some of the findings of these investigations are in line with reality. For example, very few farmers have access to credit facilities. The few that have access may tend to have big farm lands, more labourers and improved planting materials, leading to very high cassava yield. On the other hand, some cassava farmers may divert the money received as credit to purposes other than the cassava production for which it was obtained (the odds ratio for credit facility is below one, meaning that access to credit decreases the odds of high cassava yield). So it was not surprising that, when credit facility entered the equation, the incidence of influential observations and outliers was alarming. Nevertheless, we still acknowledge that credit facility is a very strong determinant of the level (high or low) of cassava yield in the study area.

Table 8: Variables in the Equation (Step 1a)

| Variable | B | S.E. | Wald | df | Sig. | Exp(B) |
|---|---|---|---|---|---|---|
| FARMING_EXPERIENCE | .104 | .032 | 10.690 | 1 | .001 | 1.109 |
| AGE | -.009 | .026 | .120 | 1 | .729 | .991 |
| EDUCATIONAL | | | 14.967 | 3 | .002 | |
| EDUCATIONAL(1) | -1.874 | 1.189 | 2.486 | 1 | .115 | .153 |
| EDUCATIONAL(2) | .304 | 1.329 | .052 | 1 | .819 | 1.355 |
| EDUCATIONAL(3) | -.181 | 1.250 | .021 | 1 | .885 | .834 |
| FARM_SIZE | .127 | .069 | 3.371 | 1 | .066 | 1.136 |
| SOURCES_OF_LABOUR | | | 6.532 | 2 | .038 | |
| SOURCES_OF_LABOUR(1) | -1.370 | .538 | 6.474 | 1 | .011 | .254 |
| SOURCES_OF_LABOUR(2) | -.490 | .604 | .658 | 1 | .417 | .612 |
| CREDIT(1) | -3.085 | .661 | 21.803 | 1 | .000 | .046 |
| CONTROL_MEAN(1) | -1.816 | .572 | 10.083 | 1 | .001 | .163 |
| Constant | 4.142 | 1.911 | 4.697 | 1 | .030 | 62.952 |

a. Variable(s) entered on step 1: FARMING_EXPERIENCE, AGE, EDUCATIONAL, FARM_SIZE, SOURCES_OF_LABOUR, CREDIT, CONTROL_MEAN.

Table 9: Hosmer and Lemeshow Test

| Step | Chi-square | df | Sig. |
|---|---|---|---|
| 1 | 22.670 | 8 | .004 |
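The significance value of a Hosmer and Lemeshow test is the upper-tail probability of its chi-square statistic. As a minimal sketch using only the standard library, the function below evaluates that tail probability with the closed-form series that holds when the degrees of freedom are even, which covers the 8 degrees of freedom used here. The function name is an assumption; the inputs are the statistics printed in Table 9 and, for comparison, Table 14.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    For df = 2m the tail has the closed form
    exp(-x / 2) * sum_{k=0}^{m-1} (x / 2) ** k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0   # current series term, starting at k = 0
    total = 1.0  # running sum of the series
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Chi-square statistics as printed in Table 9 and Table 14 (df = 8 in both).
print("multivariable stage:", round(chi2_sf_even_df(22.670, 8), 3))
print("final model:", round(chi2_sf_even_df(5.309, 8), 3))
```

A small p-value, as at the multivariable stage, signals lack of fit, while a large p-value gives no evidence against the fitted model.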
With the significant independent variables selected, the next step was to fit the final logistic regression model.

Final Model
This is the last stage of the analysis. After the variables had been selected in the first two stages of the logistic regression modeling, the following analytical procedures were used to build and confirm the final model, so as to achieve our objective of identifying the main factors that influence cassava productivity and determining the effect of each factor on cassava yield in the study area. The categorical variable coding presented in Table 10 shows that the majority of cassava farmers (119 of 200) had no formal education.

Table 10: Categorical Variables Codings (EDUCATIONAL LEVEL OF RESPONDENT)

| Category | Frequency | Parameter coding (1) | Parameter coding (2) | Parameter coding (3) |
|---|---|---|---|---|
| NO FORMAL EDUCATION | 119 | 1.000 | .000 | .000 |
| PRIMARY SCHOOL | 16 | .000 | 1.000 | .000 |
| SECONDARY SCHOOL | 55 | .000 | .000 | 1.000 |
| TECH - VOC. | 10 | .000 | .000 | .000 |

The model coefficients are contained in the column headed B in Table 11. A negative coefficient means that the odds of high cassava yield decrease as the predictor increases. The output in Table 11 helped to identify the key determinants of cassava productivity, that is, the independent variables that contributed significantly to the level of cassava yield. It also helped to determine how each determinant influenced cassava yield. From Table 11, it is clear that, among the independent variables that entered the final model, farm size, with a Wald significance level far below the threshold of 0.05, is the main factor that influenced the level of cassava yield.
The odds ratio (Exp(B)) associated with farm size is 1.188, which is greater than one (>1), meaning that an increase in farm size increases the odds, and hence the probability, of high cassava yield. In other words, each additional acre of farm size multiplies the odds of high cassava yield by about 1.188.

Also, from Table 11, educational level is a significant factor in determining the level of cassava yield. Its odds ratio (Exp(B)) is less than one (<1) for all three coded categories (no formal education, primary school and secondary school), with Tech-Voc. as the reference category (Table 10). This means that the probability of high cassava yield with a unit increase in educational level is lower than with no increase. In other words, the odds of high cassava yield are lower for farmers with a high educational level than for those with no or a low educational level.

Lastly, from Table 11, the interaction term, farming experience by age, is a highly significant factor (Sig. = .000) in determining the level of cassava yield. Its odds ratio is greater than one. This implies that the odds of high cassava yield are higher for older people with more farming experience than for younger people with less farming experience. That is, the odds of high cassava yield rise with each unit increase in age by farming experience.

Table 11: Variables in the Equation (Step 1a)

| Variable | B | S.E. | Wald | df | Sig. | Exp(B) | 95% C.I. for Exp(B) Lower | 95% C.I. for Exp(B) Upper |
|---|---|---|---|---|---|---|---|---|
| EDUCATIONAL | | | 21.921 | 3 | .000 | | | |
| EDUCATIONAL(1) | -2.099 | 1.217 | 2.975 | 1 | .085 | .123 | .011 | 1.331 |
| EDUCATIONAL(2) | -.971 | 1.322 | .539 | 1 | .463 | .379 | .028 | 5.051 |
| EDUCATIONAL(3) | -.085 | 1.263 | .005 | 1 | .946 | .918 | .077 | 10.908 |
| FARM_SIZE | .172 | .060 | 8.357 | 1 | .004 | 1.188 | 1.057 | 1.335 |
| AGE by FARMING_EXPERIENCE | .003 | .001 | 26.598 | 1 | .000 | 1.003 | 1.002 | 1.004 |
| Constant | -.809 | 1.275 | .402 | 1 | .526 | .445 | | |

a. Variable(s) entered on step 1: EDUCATIONAL, FARM_SIZE, AGE * FARMING_EXPERIENCE.
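The Exp(B) column and its 95% confidence limits in Table 11 follow from B and S.E. by exponentiating B and B ± 1.96 × S.E. The sketch below applies this to the rounded farm-size row and also shows how an odds ratio translates into a change in probability from a chosen baseline. The baseline probability of 0.5 is purely hypothetical, and the function names are illustrative rather than taken from the original analysis.

```python
import math


def odds_ratio_ci(coef, se, z=1.96):
    """Odds ratio with an approximate 95% Wald confidence interval."""
    return (
        math.exp(coef),
        math.exp(coef - z * se),
        math.exp(coef + z * se),
    )


def shifted_probability(p, odds_ratio):
    """Probability after multiplying the odds p / (1 - p) by an odds ratio."""
    odds = p / (1.0 - p) * odds_ratio
    return odds / (1.0 + odds)


# Rounded farm-size row of Table 11 (B = 0.172, S.E. = 0.060).
or_value, lower, upper = odds_ratio_ci(0.172, 0.060)
print("OR:", round(or_value, 3), "CI:", round(lower, 3), round(upper, 3))

# Hypothetical baseline probability of high yield, for illustration only.
print("new probability:", round(shifted_probability(0.5, or_value), 3))
```

An odds ratio multiplies the odds, not the probability, so the change in probability depends on the baseline from which it is measured.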
Model Checking

Chi-square goodness of fit test for model coefficients
The test in Table 12 was used to check whether the present (new) model, with explanatory variables included, is an improvement over the baseline model. The test uses the chi-square distribution to assess whether there is a significant difference between the −2 log-likelihoods (−2LL) of the baseline model and the present model. A significantly reduced −2LL suggests that the new model explains more of the variation in the outcome variable than the baseline model. In other words, a significantly reduced −2LL shows that the new model is an improvement over the baseline model. From Table 12, the chi-square statistic is highly significant (chi-square = 87.395, df = 5, p < .001). This shows that the present (new) model is significantly better than the baseline model.
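The omnibus chi-square is the drop in −2LL from the baseline model to the fitted model, and the same quantities determine the pseudo-R2 values reported in the model summary (Table 15). The sketch below uses the standard Cox and Snell and Nagelkerke formulas with the printed values (chi-square = 87.395, final −2LL = 186.977, n = 200). The baseline −2LL is not printed in the paper and is recovered here by addition, which is an assumption of this sketch; the function names are illustrative.

```python
import math


def cox_snell_r2(model_chi2, n):
    """Cox and Snell R^2 = 1 - exp(-(D0 - D1) / n), with D = -2 log-likelihood."""
    return 1.0 - math.exp(-model_chi2 / n)


def nagelkerke_r2(model_chi2, baseline_minus2ll, n):
    """Nagelkerke R^2: Cox and Snell R^2 divided by its maximum attainable value."""
    max_r2 = 1.0 - math.exp(-baseline_minus2ll / n)
    return cox_snell_r2(model_chi2, n) / max_r2


n = 200
model_chi2 = 87.395       # omnibus chi-square, Table 12
final_minus2ll = 186.977  # -2 log-likelihood of the fitted model, Table 15
# Baseline -2LL recovered as final -2LL plus the chi-square (assumed, not printed).
baseline_minus2ll = final_minus2ll + model_chi2

print("Cox & Snell R2:", round(cox_snell_r2(model_chi2, n), 3))
print("Nagelkerke R2:", round(nagelkerke_r2(model_chi2, baseline_minus2ll, n), 3))
```

Nagelkerke's version rescales the Cox and Snell value so that its maximum is 1, which is why it is always the larger of the two.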
Table 12: Omnibus Tests of Model Coefficients

| Step 1 | Chi-square | df | Sig. |
|---|---|---|---|
| Step | 87.395 | 5 | .000 |
| Block | 87.395 | 5 | .000 |
| Model | 87.395 | 5 | .000 |

Outcome Classification
From the classification table presented in Table 13, the present logistic regression model correctly classified the outcome for 77% of the cases.

Table 13: Classification Table (a)

| Observed OUTPUT OR YIELD (Step 1) | Predicted LOW | Predicted HIGH | Percentage Correct |
|---|---|---|---|
| LOW | 66 | 22 | 75.0 |
| HIGH | 24 | 88 | 78.6 |
| Overall Percentage | | | 77.0 |

a. The cut value is .500

Model chi-square goodness of fit test
The hypotheses tested for the model goodness of fit were stated as:
H0: The model is a good-fitting model.
Ha: The model is not a good-fitting model.
From Table 14, the goodness-of-fit test shows that the model is a good fit to the data, as p = 0.724 > 0.05.

Table 14: Hosmer and Lemeshow Test

| Step | Chi-square | df | Sig. |
|---|---|---|---|
| 1 | 5.309 | 8 | .724 |

Measures of the Predictive Power of the Model
From the model summary presented in Table 15, between 35% and 47% of the variation in cassava yield was explained by the logistic regression model.

Table 15: Model Summary

| Step | -2 Log likelihood | Cox & Snell R Square | Nagelkerke R Square |
|---|---|---|---|
| 1 | 186.977 (a) | .354 | .474 |

a. Estimation terminated at iteration number 5 because parameter estimates changed by less than .001.

Influential Observation and Outliers
Again, it is good practice to find out whether there are observations that do not fit the model well (outliers), have unusual values (leverage) or have undue influence on the model (influential observations). In this work, Cook's distance, denoted Di, was used to identify influential observations, that is, points that unduly affect the fitted logistic regression model.
The measure combines each observation's leverage and residual values; the higher the leverage and residuals, the higher the Cook's distance. A Di value of more than 1 indicates that an influential observation is present. The maximum and minimum values of Cook's distance for our analysis are presented in the summary table (Table 16) below. From Table 16, the maximum value of Di is 0.40012, which is less than one (<1). Therefore, influential observations and outliers are not a concern.

Table 16: Analog of Cook's influence statistics

| Statistic | Value |
|---|---|
| N Valid | 200 |
| N Missing | 0 |
| Mode | .00028 |
| Range | .40010 |
| Minimum | .00003 |
| Maximum | .40012 |

MODEL DISCRIMINATION
How well the model distinguishes between the two groups of the binary outcome was assessed using the area under the receiver operating characteristic (ROC) curve. The two basic measures of diagnostic accuracy are sensitivity and specificity (Zhou et al., 2002). Plotting sensitivity against 1 − specificity gives the ROC curve. The diagonal line in the plot represents chance. The curve in Figure 2 is well above the diagonal line. In addition, the Area Under the Curve table reports an area under the curve (AUC) of 0.814. This represents a high predictive accuracy for the chosen model. In other words, an AUC of 0.814 (which is close to 1) indicates that the model reliably distinguished between cassava farmers with high and low cassava yields.

Figure 2: Receiver Operating Characteristic (ROC) curve
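Sensitivity and specificity, the two quantities behind the ROC curve, can be read directly from the classification counts in Table 13 at the 0.5 cut value. The sketch below treats HIGH yield as the positive class, which is an assumption made here for illustration, and the function name is likewise illustrative.

```python
def classification_summary(tn, fp, fn, tp):
    """Accuracy, sensitivity and specificity from a 2x2 classification table."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Counts from Table 13, with HIGH yield taken as the positive class (assumed):
# observed LOW:  66 predicted LOW (tn), 22 predicted HIGH (fp)
# observed HIGH: 24 predicted LOW (fn), 88 predicted HIGH (tp)
summary = classification_summary(tn=66, fp=22, fn=24, tp=88)
for name, value in summary.items():
    print(name, round(100 * value, 1))
```

This gives a single point on the ROC curve; the full curve is traced by repeating the calculation over all possible cut values, and the AUC summarizes it.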
Table 16: Area Under the Curve

| Test Result Variable(s) | Area |
|---|---|
| | .814 |

Multicollinearity
As already mentioned under the methodology section, among the ways of detecting multicollinearity in a data set, this work used the tolerance and its reciprocal, the variance inflation factor (VIF). A variable's tolerance is 1 − R2. If the tolerance is less than 0.2 or 0.1 and, simultaneously, the VIF is 10 or above, then multicollinearity is problematic. In our analysis, the highest value of R2, which is the Nagelkerke R2, is 0.474. Hence the tolerance is 1 − R2 = 1 − 0.474 = 0.526 and the VIF, its reciprocal, is 1/0.526 ≈ 1.901. The tolerance is far above 0.1 and the VIF is far below 10. It is therefore concluded that multicollinearity is not problematic. In addition, the standard errors of the coefficients are not unduly large. This further suggests that multicollinearity is not an issue here.

RESULTS AND DISCUSSION
A logistic regression analysis was carried out to find the main factors influencing cassava productivity in the Moyamba District, Southern Province of Sierra Leone. Cassava productivity was measured by the level (high or low) of cassava yield. At the initial stage of the analysis, many factors were considered as potential determinants of cassava productivity in the study area. However, further statistical investigation showed that some of those factors were not significant determinants of a high or low yield of cassava. Insignificant factors were dropped from the analysis. The variables (factors) that entered the final model are farm size, educational level and the interaction term, age by farming experience.
At the final stage of the analysis, the logistic regression model was significant, as the test of the full model against a model with only the constant was significant. This shows that the predictors as a set reliably distinguished between a high and a low yield of cassava (chi-square = 87.395, df = 5, p < .05). The model explained between 35% and 47% (Cox and Snell R2 and Nagelkerke R2 respectively) of the variation in cassava yield. The Wald criterion showed that, among the independent variables that entered the final model, farm size, with a Wald significance level far below the threshold of 0.05, was the main factor that influenced the level of cassava yield. The odds ratio (Exp(B)) associated with farm size is 1.188, which is greater than one (>1), meaning that an increase in farm size increases the odds, and hence the probability, of high cassava yield. In other words, each additional acre of farm size multiplies the odds of high cassava yield by about 1.188. This result agrees with the finding of Ren et al. (2019) that farm size plays a critical role in agricultural sustainability.

Educational level was also shown to be a significant factor in determining the level (high or low) of cassava yield. However, in line with the view of the majority of Sierra Leoneans that subsistence farming is an option for those who did not go to school or who dropped out of school, its odds ratio (Exp(B)), 0.123 for the first coded category in Table 11, is less than one (<1). This means that the probability of high cassava yield with a unit increase in educational level is lower than with no increase. In other words, the odds of high cassava yield are lower for a higher educational level.
This result is similar to that obtained by Reimers and Klasen (2013), who detected insignificant or even surprisingly negative effects of schooling on agricultural productivity.

Finally, the interaction term, farming experience by age, is a highly significant factor (Sig. = .000) in determining the level of cassava yield. Its odds ratio is greater than one. This implies that the odds of high cassava yield are higher for older people with more farming experience than for younger people with less farming experience. In other words, the odds of high cassava yield rise with each unit increase in age by farming experience. This is not surprising, as extension services for disseminating information on farm technologies are not common in the rural areas. Farmers only gain experience after long years of farming. A study by Danso-Abbeam et al. (2018) reaffirmed the critical role of extension programmes in enhancing farm productivity and household income.

Credit facility, although it did not enter the final model (as it exhibited extreme behaviour), was still recognized as a significant determinant of a high level of cassava yield. This is because the Wald p-value associated with credit facility was significant at both the univariate and multivariable stages of the variable selection. This result is supported by Ekwere et al. (2014), in their article titled "Effects of agricultural credit facility on the agricultural production and rural development", in which they reported that the independent variables loan size, farm size and inputs explained the variation in the total value of farmers' output.

CONCLUSION
The purpose of this work was to identify the main factors influencing cassava productivity and to determine the effect of each factor on cassava yield/output.
The empirical evidence showed that farm size, educational level, and age by farming experience are the main factors influencing cassava productivity in the study area. An increase in farm size can increase cassava yield, while an increase in educational level can decrease cassava yield. In fact, most of the cassava farmers had no formal education. Older people with more farming experience contributed significantly to cassava production in the study area.

REFERENCES
Ekwere et al. (2014). Effects of agricultural credit facility on the agricultural production and rural development. International Journal of the Environment, Vol. 3 (2): 192-204.
Essers MA, de Vries-Smits LM, Barker N et al. (2005). Functional interaction between beta-catenin and FOXO in oxidative stress signalling. Science 308: 1181-1184.
FAO (2003a). The state of food insecurity in the World: monitoring progress towards the food summit and millennium development goals. Rome, Italy: pp. 24-26.
FAO (2003b). Cassava production data (2002). (http://www.fao.org).
FAO (2004). Proposals for a definition and methods of analysis for dietary fibre content. CX/NFSDU 04/3 Add 1. Codex Committee on Nutrition and Foods for Special Dietary Uses. Codex Alimentarius Commission.
Danso-Abbeam G, Ehiakpor DS, Aidoo R (2018). Agricultural extension and its effects on farm productivity and income: insight from Northern Ghana. Agriculture & Food Security. https://doi.org/10.1186/s40066-018-0225-
Hosmer DW, Lemeshow S (2000). Applied logistic regression. New York: Wiley.
Reimers M, Klasen S (2013). Revisiting the Role of Education for Agricultural Productivity. American Journal of Agricultural Economics, vol. 95, issue 1: 131-152.
Peduzzi P, Concato J, Kemper E, Holford TR, Feinstein AR (1996). Simulation study of the number of events per variable in logistic regression analysis. Journal of Clinical Epidemiology.
Ren C, Liu S et al. (2019). The impact of farm size on agricultural sustainability. Journal of Cleaner Production, vol. 22.
Sanni LO, Onadipe P, Ilona MD, Mussagy A, Abass A, Dixon AGO (2009). Successes and challenges of cassava enterprises in West Africa: a case study of Nigeria, Bénin, and Sierra Leone. IITA, Ibadan, Nigeria. 19 pp.
Statistics Sierra Leone (2015). Population and Housing Census.
Zhou XH, Obuchowski NA, McClish DK (2002). Statistical methods in diagnostic medicine. New York: Wiley & Sons.

Accepted 29 August 2019

Citation: Sesay RB, Koneh A (2019). A Logistic Regression Model to Identify Factors Influencing Cassava Productivity in the Southern Part of Sierra Leone. Journal of Agricultural Economics and Rural Development, 5(2): 592-604.

Copyright: © 2019: Sesay and Koneh. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.