Electronic copy available at: http://ssrn.com/abstract=2214543
A Sales Forecasting Model Based on Internal Organizational Variables
Jose M. Pinheiro
Abstract
In this paper we develop a sales forecasting model for a small sized business unit
focused on exports. Through a choice of internal explanatory variables in the
organization we develop an econometric sales forecasting method, and compare its
outputs with simpler univariate forecasting techniques in use at the organization. We
show that the econometric technique produces a better fit to observable data, allowing
for sensible statistical inference, and adds explanatory features not accessible to the
simpler extrapolation techniques, by integrating quantitative variables accounting for
relevant management decisions.
Key words: Sales Forecasting, Econometric Sales Forecasting, Time Series Models, OLS Regression,
Extrapolation Sales Forecasting.
1. Introduction
In this paper, we present, analyze and discuss a sales forecasting econometric model,
discuss its capabilities, and compare it to extrapolation procedures in use on an exports
business unit of a Portuguese firm of fast moving consumer goods (FMCG).
The peculiarity of the econometric model presented is its exclusive focus on observable
explanatory variables which are internal to the organization. This feature could facilitate
its test and eventual implementation in different organizations sharing a few
characteristic features with the organization upon which we base the model, since no
great effort or costs would be necessary to gather the kind of data it requires.
The availability of the internal data necessary to build the explanatory variables,
together with the work of (Rumelt, 1991), which concluded that external
market and industry characteristics were less determinant of a firm's performance than
the organization itself, is what led us to the econometric models presented in this paper.
Similarly, we show that our econometric model can reasonably explain sales (exports)
that would not be as easily explained had we chosen external economic variables
such as the annual growth rates of Portuguese exports during the period, or even
Europe’s annual GDP growth rates (European countries have been the destination of more
than 70% of the exports produced by the business).
In fact, if we look at the exports data of this business unit (2007-2012), we see a
continuous growth with no parallel in either the annual growth rates of Portuguese
exports, or the GDP annual growth rates of most destination countries, both negatively
affected in the aftermath of the U.S. sub-prime crisis.
Since its set-up year (2006), the business unit has presented annual sales growth rates
of 90% (2007), 113% (2008), 15% (2009), 17% (2010), 34% (2011) and 33% (2012
est.). These growth rates clearly contrast with Portuguese negative growth rates of
exports of goods and the steep reduction in imported goods in most economies in 2008
and 2009 (Banco de Portugal, 2011).
We believe the model class we discuss could be easily deployed in other SMEs
to their management's advantage, reinforcing simpler forecasting
techniques and supporting better sales objective setting and informed direct decision-making regarding
variables such as pricing, client portfolio diversification and sales force size, as well as
indirect decision-making that depends on sales, such as inventory and
production planning.
The paper is organized as follows. In Section 2 we discuss the forecasting literature
as it relates to econometric models and their comparison with univariate
extrapolation methods and judgmental forecasting techniques.
In Section 3 we present the model, the data sources used and their
characteristics. We present the model estimations and some diagnostic test results, and
compare the model with the simpler techniques in use at the business unit under
focus.
In Section 4 we present and discuss a battery of tests performed on the model and draw
some inferences on the accuracy of its predictions, and in Section 5 we conclude with
the opportunities and limitations inherent to this work.
Section 6 is the Technical Annex of the paper.
2. Literature
Although Econometrics has developed significantly throughout the 20th century, adding
“empirical content to economic theory and allowing theories to be tested and used for
forecasting and policy evaluation” (Geweke, et al., 2008), still relatively few
Portuguese firms discuss, or use, econometric forecasting methods, especially if their
organization is of small size.
This may be so due to lack of specialized competences, lack of resources, organizational
culture limitations, modeling difficulties, or simply lack of knowledge about the power
of modern econometrics on forecasting.
Despite this perceived situation, many marketing managers throughout the world value
sales forecasting techniques.
In fact, sales are the main variable upon which budgets in all organizational areas are
developed, determining the sustainable components of annual marketing investments,
staff recruitment, and other types of investment needed for organizational development,
training or maintenance.
In a survey of marketing managers conducted as far back as 1975, 93%
claimed sales forecasting to be “one of the most critical” or “a very important aspect of
their company’s success” (Dalrymple, 1975).
The same author concluded that “formal marketing plans are often supported by
forecasts” (Dalrymple, 1987).
However, astonishingly, most of the marketing literature does not address sales
forecasting (Armstrong, et al., 1987). Even to this day, it is not easy to access public
domain research on sales forecasting, one of the reasons being its private and classified
nature.
Sales forecasting methodologies can be based on judgmental sources or statistical
sources, and sometimes use both types of sources (Armstrong & Collopy, 1998).
While judgmental sources are of qualitative nature, statistical sources can be either
univariate or multivariate. Univariate statistical sources used in forecasting models use
extrapolation methods based on quantitative analogies and rule-based forecasting,
mainly. Extrapolation uses historical data, and often exponential smoothing, which
attributes more weight to recent data, while rule-based forecasting accounts for
judgmental knowledge of the factors influencing sales forecasts.
On the other hand, multivariate statistical sources are based on data and rely on
econometric models to produce forecasts.
Some researchers have concluded that relatively simple extrapolation methods can
perform as well as more complex methods (Makridakis, 1984), (Armstrong, 1985).
This is not surprising if we assume that simpler extrapolation methods may at times
integrate, in a heuristic manner, management information about the object of the forecast,
which is not always the case with the data used in econometric models.
In fact, it is not simple for time series models, for example, to integrate a manager’s
knowledge of context variables hard or impossible to quantify, but nevertheless
potentially influencing to a high degree the series in the future. Some quantitative
methods display the limitation of intrinsically assuming that the causal forces that
influence an historical data series will propagate in the same manner into the future.
Therefore, judgmental extrapolations may be more effective than quantitative
extrapolations whenever there are known and anticipated sales changes, when there is
relevant knowledge that can easily be integrated into judgmental extrapolations, or
when there are reasonably clear expectations about the way in which several context
factors may affect sales development (for example, political decisions affecting
aggregate demand, top management decisions expected to be taken within the
forecasting horizon and affecting sales performance, industrial and technological
changes, and other types of change that managers know to affect sales).
Some econometric models can incorporate, to some extent, decision making and
judgmental planning, for example by taking explicit account of planned marketing
variables over the forecasting horizon. These models tend to be highly
complex and require considerable effort in data gathering, and are out of reach for
many firms.
Judgmental and univariate extrapolations for sales forecasting are a manager’s obvious
choice when substantial amounts of sales data are not available. Judgmental
extrapolations rely primarily on good knowledge of context, including product demand,
markets, future plans and other factors. Sales extrapolations can also work well if a
regular sales behavior is achieved.
(Mentzer & Kahn, 1995) argue that extrapolation of the historical sales trend is common
in firms.
Other authors presented evidence that simple extrapolation-like forecasts were often
amongst the most accurate procedures (Schnaars, 1984). Even when there is more
uncertainty, conservative criteria may be adopted such as staying close to the historical
average, which helps dampen the trend as the horizon increases (Gardner & McKenzie,
1985).
However, according to (Baker, 1999), econometric methods are more useful, when:
i. Strong causal relationships with sales are expected;
ii. These causal relationships can be estimated;
iii. Large changes are expected to occur in the causal variables over the forecast
horizon, and,
iv. The changes in the causal variables can be forecasted or controlled, especially
with respect to their direction.
The author further argues that in the absence of any of the above conditions, econometric
models “should not be expected to improve accuracy.”
Comparing different forecasting models and techniques is important, since one often
gains in considering more than one method of forecast (Baker, 1999).
(Baker, 1999) further states that the selection of forecasting methods can
be avoided by combining forecasts, and that fit measures such as $R^2$ are of secondary
importance vis-a-vis realistic simulations of the actual situation of a forecaster, i.e. ex
ante forecasts can be more important than fit measures.
Other researchers have suggested that when well-structured domain knowledge is
lacking, equally-weighted averages are as accurate a scheme as any other (Clemen,
1989). Another example of this was provided by (Blattberg & Hoch, 1990), who
obtained superior sales forecasts by taking equally weighted averages of judgmental
forecasts and quantitative models.
The literature therefore points towards the comparison and combined
use of forecasting techniques in order to achieve more balanced and accurate forecasts.
3. Econometric Model
3.1 Data Set
The data used consists of a set of sixty-seven observations for four variables in an
exports business unit of a Portuguese firm. The data is composed of monthly
observations, ranging from February 2007 to August 2012.
Currently the business unit handles clients in more than 60 different world markets,
developing 67% of its exports to European countries, 13% to Asian countries, 12% to
African countries, 7% to American countries and 1% to Oceania. The business unit sells
branded disposable paper products, a product category commonly considered to
combine a high elasticity of demand (due to the availability of substitute goods in most
markets and the typical retail prices low amplitude) with a low geographical range for
exports (due to the high impact of transport costs on its pricing structure).
Data gathered relates directly with both the business unit modus operandi and focus,
and the rationale of the model.
Management has trained the sales team to operate in a focused manner, transversal as
well as deep and ambitious:
• To penetrate as many markets as possible so as to maximize business
opportunities and minimize sales vulnerability over time;
• To prioritize “optimal” commercial operations combining better chances of
providing sustainable sales and business development opportunities; this policy
is aligned with a variable monthly compensation scheme requiring sales
effectiveness;
• To use pricing as deemed necessary to achieve penetration, without incurring
market risk or losses;
• To foster team work as much as role specialization to the maximum possible
extent in order to maximize productivity and know-how sharing among team
members;
• To place a special focus on establishing and fostering indirect business structures in
foreign countries, which has been key to the continuous development of a global and
expanding network of agents, distributors, brokers and other types of business
relations.
The data gathered has 67 monthly observations for each of the following variables:
• Sales data of the business unit: “Sales”, the dependent variable (in Euro).
• Average prices of sales per unit mass: “Price”, an independent variable (in
Euro/Kg).
• Number of people on the sales team: “Sales Force”, an independent variable.
• Monthly number of sales operations concluded: “Portfolio”, an independent
variable.
It is possible to look at all the data gathered in a combined way using a multiple
display graphic with left and right scales:

[Figure: monthly series, 2007-2012. Sales (right scale); Portfolio, Price and Salesforce (left scale).]

It is also possible to inspect the degree to which the variables chosen are correlated. The
correlation matrix for the variables chosen is:

Correlation coefficients, using the observations 2007:02 - 2012:08
5% critical value (two-tailed) = 0.2404 for n = 67

             Sales     Portfolio   Price     Salesforce
Sales        1.0000    0.6276      -0.4621   0.6315
Portfolio              1.0000      -0.3714   0.5390
Price                              1.0000    -0.3315
Salesforce                                   1.0000

Correlation between variables only indicates the magnitude and sign of their relation,
and does not imply causality. The Pearson product-moment correlation coefficient
between two variables $x$ and $y$ can be written as $r_{xy} = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y}$, where
$\mathrm{cov}(x, y)$ is the covariance between $x$ and $y$, and $\sigma_x$ and $\sigma_y$ are their standard deviations. For each sample set of two discrete
variables observed at discrete moments $t$ ranging from 1 to $T$, we have

$$r_{xy} = \frac{\sum_{t=1}^{T} (x_t - \bar{x})(y_t - \bar{y})}{\sqrt{\sum_{t=1}^{T} (x_t - \bar{x})^2}\,\sqrt{\sum_{t=1}^{T} (y_t - \bar{y})^2}}.$$
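As an illustration of the sample correlation formula, the following minimal Python sketch computes the coefficient directly from the sums of deviations. The function name `pearson_r` and the short price and sales series are assumptions made for the example; they are not the business unit's data.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Numerator: sum of cross-products of deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator: product of the root sums of squared deviations.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Made-up monthly figures (hypothetical, for illustration only).
price = [1.10, 1.05, 1.00, 0.98, 0.95, 0.90]
sales = [100.0, 120.0, 125.0, 150.0, 160.0, 190.0]
r = pearson_r(price, sales)
print(round(r, 3))  # negative: in this toy series sales rise as price falls
```

A coefficient computed this way would then be compared, in absolute value, with the 5% critical value reported with the correlation matrix.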
3.2 Economic and Managerial Rationale of the Explanatory Variables
3.2.1 Pricing
Our choice of data and variables has a simple rationale, in some cases deeply supported
by existing literature.
Alfred Marshall was probably the first to define the elasticity of demand, thereby
providing a mathematical framework for the understanding of the
possible variations of demand with prices (Marshall, 1890).
In his own words: “And we may say generally: - the elasticity (or responsiveness) of
demand in a market is great or small according as the amount demanded increases
much or little for a given fall in price, and diminishes much or little for a given rise in
price”.
And, more specifically: "the only universal law as to a person's desire for a commodity
is that it diminishes... but this diminution may be slow or rapid. If it is slow... a small
fall in price will cause a comparatively large increase in his purchases. But if it is
rapid, a small fall in price will cause only a very small increase in his purchases. In the
former case... the elasticity of his wants, we may say, is great. In the latter case... the
elasticity of his demand is small."
Marshall defined this elasticity using differential calculus. Generally, elasticity can vary
depending on the goods and related markets concerned. The availability of substitute
goods tends to increase elasticity (say, the case for disposable paper products), while the
specificity of some goods tends to lower it (food and energy, but also branded items for
the case of fast moving consumer goods). The higher the disposable income percentage
needed to buy a good, the higher the elasticity tends to be, given the purchase decision
will tend to be much less discretionary and much more cautious.
The products handled by the business unit have a twofold characteristic: on one hand,
they are usually considered to be easily replaceable, as there are in most markets, even
emerging ones, readily available substitute products; on the other hand, a brand factor has
been introduced and is changing the nature of the goods, a process leading to easier
exports and sales to retailers seeking to differentiate their product assortments in their
home markets, even when located at large distances from the producer’s facilities.
The econometric model developed will allow us to assess the average elasticity of
demand of the underlying products in the set of export markets implicitly included in
the sample data, for the considered period in the analysis.
3.2.2 Sales Force Sizing
According to (Lilien, et al., 1992), “The size of the sales force is one of the most
important decisions facing executives in many industries” and “The specific options
chosen – sales force size versus the use of wholesalers, distributors, agents, and so
forth – depend on the relative costs and the selling tasks required”.
The sales force sizing decision often uses heuristics like the breakdown – percentage of
sales – approach, the workload approach or an industry guideline method (Lilien, et al.,
1992).
Such heuristics are used in the business unit under study; more precisely a workload
approach is regularly used when assessing the need to resize the business unit sales
force.
The sales force size can also be viewed as a proxy for a group of variables including
external aggregate demand and sales force experience.
As such, workload is therefore not only a function of the processes involved in the
generation and processing of the sales per se, but also a function of experience, training,
team work, and personal sales effectiveness of each salesman in the team.
Given the small size of the business unit’s sales force, workload assessments result from
a direct management follow up of the variations and perceived complexity of processes
under development at a given time as well as historically. Such an assessment is
performed periodically.
Our sample contains several variations in the sales force size, which reflect either
salesmen turnover or, more commonly, assessments and consequent resizing of the
sales force.
In the business unit under study, the sales force size is judged to be of critical importance,
given the high workloads and longer times typically associated with the development of
export operations, usually more complex than sales directed at domestic markets.
3.2.3 Portfolio Sizing
Perhaps less portrayed in marketing science literature, client portfolio size is an
important variable to maximize sales growth, business opportunities and,
simultaneously, reduce a business unit sales vulnerability, and cash flows volatility,
over time.
(Crina, et al., 2011) have developed interesting research, using financial portfolio
theory to reorganize client portfolios towards profit maximization and cash flow
volatility minimization, providing guidelines for incorporating a risk overlay into
established customer management frameworks.
The example provided by the aforementioned authors establishes a parallel between the
role of portfolio diversification in common business portfolios and the role of the well-
known diversification of stock market portfolios, with the same objectives of
minimization of risk and maximization of returns.
With a similar concept in mind, the business unit management has provided moral
incentives for its sales force to secure ever more medium and large clients and business
prospects over time, almost abolishing geographical discrimination criteria.
These objectives have led to a continuous business seeking attitude, across ever more
international markets, as well as to expanding businesses within each national market.
As the group attitude set in, this process resulted in a slow, but steady, growing client
portfolio trend, and a growing number of medium and large clients within the portfolio,
over time.
This strategy should at least partly explain why the business unit sales records do not
show any particularly strong downturn in the aftermath of the sub-prime crisis, unlike
many other export business units.
3.3 Ex Ante Modeling Hypothesis
In view of the literature support and managerial rationale presented in the preceding
sections, we hypothesize the following relations:
I. “Price” should have a negative relation with “Sales”;
II. “Sales Force” [Size] should have a positive relation with “Sales”, and
III. “Portfolio” [Size] should have a positive relation with “Sales”.
As such, regardless of the econometric model, the estimated coefficient sign for Price
should be negative, and the sign for the estimated Sales Force coefficient should be
positive, as well as the sign for the estimated Portfolio coefficient.
3.4 Econometric Model
3.4.1 Model Specification and Estimation
According to (Gujarati & Porter, 2009), “broadly speaking, there are five approaches to
economic forecasting based on time series data: (1) exponential smoothing methods, (2)
single-equation regression models, (3) simultaneous-equation regression models, (4)
autoregressive integrated moving average models, and (5) vector autoregression
models.”
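We have a multiple regression model, specified by:

$$Sales_t = A \cdot Portfolio_t^{\beta_1} \cdot Price_t^{\beta_2} \cdot Salesforce_t^{\beta_3} \cdot e^{u_t},$$

with t = 1…T, T = 67, and $u_t$ the stochastic disturbance term associated with
observation t.
A linearization can be written as:

$$\ln Sales_t = \beta_0 + \beta_1 \ln Portfolio_t + \beta_2 \ln Price_t + \beta_3 \ln Salesforce_t + u_t,$$

where $\beta_0 = \ln A$.
It is worth mentioning that the linearized data presents a much smoother evolution in
time, as depicted in the graphic below:

[Figure: log-transformed monthly series l_Sales, l_Portfolio, l_Price and l_Salesforce, 2007-2012.]

The implicit assumptions for the use of the Ordinary Least Squares method in the multiple
regression model are (Gujarati & Porter, 2009):
i) Linearity in the parameters.
ii) The regressors are uncorrelated with the disturbance term: $\mathrm{cov}(u_t, X_{kt}) = 0$ for every regressor $X_k$.
iii) Zero mean value of the disturbance: $E(u_t \mid X_t) = 0$.
iv) Homoscedasticity: $\mathrm{var}(u_t \mid X_t) = \sigma^2$.
v) No autocorrelation between disturbances: $\mathrm{cov}(u_t, u_s \mid X_t, X_s) = 0$ for $t \neq s$.
vi) The number of observations T must be greater than the number of
parameters.
vii) There must be variation in the values of all independent variables.
viii) There must not be any exact correlation between the independent variables.
ix) The model is correctly specified.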
Compliance with the above assumptions is observed and further discussed in Sections 4
and 6. The estimation of the model (Table.1) using Ordinary Least Squares shows that
all coefficients are significant at the 1% significance level, and that the coefficient signs
are as expected.
The estimated model equation can be written as:

$$\widehat{\ln Sales_t} = 9.728 + 0.682 \ln Portfolio_t - 0.689 \ln Price_t + 0.600 \ln Salesforce_t,$$

with t = 1…T, T = 67.
Table.1 Model 1: OLS, using observations 2007:02-2012:08 (T = 67)
Dependent variable: l_Sales

               Coefficient   Std. Error   t-ratio   p-value     Sig.
Constant       9.72829       0.646982     15.04     1.62e-022   ***
l_Portfolio    0.681882      0.197953     3.445     0.0010      ***
l_Price        -0.688531     0.192897     -3.569    0.0007      ***
l_Salesforce   0.60029       0.224509     2.674     0.0095      ***

Mean dependent var   11.92724    S.D. dependent var   0.669145
Sum squared resid    12.43029    S.E. of regression   0.444192
R-squared            0.579374    Adjusted R-squared   0.559344
F(3, 63)             28.92555    P-value(F)           7.01e-12
Log-likelihood       -38.63623   Akaike criterion     85.27247
Schwarz criterion    94.09124    Hannan-Quinn         88.76208
Rho                  0.068852    Durbin-Watson        1.817990
What the Ordinary Least Squares (OLS) method does is minimize the sum of squared
vertical distances between the observed responses in the data set and the responses
predicted by the linear approximation. This process estimates coefficients (“slopes”)
relating the explanatory variables to the dependent variable but provides no evidence of
causality.
The OLS estimator is consistent when the regressors are exogenous (i.e. uncorrelated
with the error term) and there is no perfect multicollinearity (i.e. no exact linear relation among
the explanatory variables), and optimal in the class of linear unbiased estimators when the
errors are homoscedastic (i.e. display constant variance) and serially uncorrelated (i.e.
show no correlation across time).
Under these conditions, the method of OLS provides minimum-variance, mean-unbiased
estimation when the errors have finite variances. Under the additional
assumption that the errors are normally distributed (which they are, as shown in Section
6), the OLS estimator is also the maximum likelihood estimator.
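To make the least squares step concrete, the following minimal Python sketch estimates a log-log regression through the normal equations $(X'X)b = X'y$. It is a sketch under stated assumptions, not the estimation procedure used for Table.1: the helper names (`solve_linear_system`, `ols`) are hypothetical, and the data are synthetic, generated from a known multiplicative model with coefficients chosen only to resemble the magnitudes discussed in this section.

```python
import math
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(rows, y):
    """OLS coefficients from the normal equations; a constant is prepended."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * yi for xi, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Synthetic sample (hypothetical values, NOT the paper's data).
random.seed(0)
true_beta = [9.7, 0.68, -0.69, 0.60]  # constant, Portfolio, Price, Salesforce
rows, y = [], []
for _ in range(67):
    logs = [
        math.log(random.uniform(5, 40)),       # ln(Portfolio)
        math.log(random.uniform(0.8, 2.0)),    # ln(Price)
        math.log(random.choice([2, 3, 4, 5])), # ln(Salesforce)
    ]
    noise = random.gauss(0.0, 0.05)
    rows.append(logs)
    y.append(true_beta[0] + sum(b * v for b, v in zip(true_beta[1:], logs)) + noise)

beta_hat = ols(rows, y)
print([round(b, 2) for b in beta_hat])
```

With real data, a dedicated statistical package would normally be preferred, since it also reports standard errors and diagnostic statistics.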
From Table.1 we can see that the $R^2$ of the regression is 0.579, meaning the model can
explain almost 58% of the variance observed in the sample. This is not an excellent
result, but considering we are using only organizational variables that are easy to gather in any
firm, it is a reasonable one. The $R^2$ can be written as $R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$, where $SS_{res}$ is
the sum of squared residuals with respect to the regression, and $SS_{tot}$ is the sum of squared
deviations with respect to the average. The adjusted $R^2$, or $\bar{R}^2$, is related to the $R^2$ through
the relation $\bar{R}^2 = 1 - (1 - R^2)\frac{T - 1}{T - n - 1}$, where T is the sample size and n the number of
regressors.
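The fit measures reported in Table.1 can be cross-checked from its summary statistics. The short Python sketch below does so; the function names are hypothetical, and the total sum of squares is recovered from the reported standard deviation of the dependent variable, which assumes that figure is the sample standard deviation (divisor T - 1).

```python
def r_squared(ss_res, ss_tot):
    """Coefficient of determination from residual and total sums of squares."""
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, t, n):
    """Adjusted R-squared for sample size t and n regressors (constant excluded)."""
    return 1.0 - (1.0 - r2) * (t - 1) / (t - n - 1)

T, N_REGRESSORS = 67, 3
SS_RES = 12.43029   # "Sum squared resid" in Table.1
SD_DEP = 0.669145   # "S.D. dependent var" in Table.1

# Total sum of squares implied by the sample standard deviation.
ss_tot = (T - 1) * SD_DEP ** 2
r2 = r_squared(SS_RES, ss_tot)
r2_adj = adjusted_r_squared(r2, T, N_REGRESSORS)
print(round(r2, 4), round(r2_adj, 4))
```

Both values should agree with the R-squared and Adjusted R-squared entries of Table.1 up to rounding.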
Looking at Table.1 above, one can find in the t-ratios a measure of individual
coefficient significance. Whenever the absolute value of the t-ratio for a given coefficient is
higher than the critical value of the Student t distribution with the degrees of freedom of the regression (in this
case $t_{0.025}(63) \approx 1.998$, for a right tail probability of 2.5%), the null
hypothesis (stating that the said coefficient is zero) is rejected. The t-ratio for a
coefficient can be expressed as $t = \hat{\beta}_j / \sqrt{\widehat{\mathrm{var}}(\hat{\beta}_j)}$. Thus for the coefficient
of l_Portfolio, for example, the 95% confidence interval is
$0.682 \pm 1.998 \times 0.198$, i.e. between 0.287 and 1.077.
In the same way, we can present the 95% confidence intervals for all the
coefficients in this estimation. Note that none of the intervals contains the value zero,
which would render a coefficient non-significant.
Variable         Coefficient   95% Conf. Int. (Min.)   95% Conf. Int. (Max.)
Constant         9.72829       8.43540                 11.0212
Ln(Portfolio)    0.681882      0.286304                1.07746
Ln(Price)        -0.688531     -1.07401                -0.303057
Ln(Salesforce)   0.600290      0.151645                1.04893
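The intervals above follow from the coefficients and standard errors of Table.1. A minimal sketch of the calculation is given below; the critical value 1.998 is an assumed approximation of the two-tailed 5% Student t value for 63 degrees of freedom, and the names `T_CRIT` and `confidence_interval` are introduced here only for illustration.

```python
T_CRIT = 1.998  # assumed approximate two-tailed 5% critical value, 63 d.o.f.

# (coefficient, standard error) pairs copied from Table.1
estimates = {
    "Constant": (9.72829, 0.646982),
    "l_Portfolio": (0.681882, 0.197953),
    "l_Price": (-0.688531, 0.192897),
    "l_Salesforce": (0.60029, 0.224509),
}

def confidence_interval(coef, std_err, t_crit=T_CRIT):
    """Symmetric interval: coefficient plus/minus t_crit standard errors."""
    half_width = t_crit * std_err
    return coef - half_width, coef + half_width

intervals = {name: confidence_interval(c, se) for name, (c, se) in estimates.items()}
for name, (low, high) in intervals.items():
    # An interval that excludes zero corresponds to rejection at the 5% level.
    print(f"{name:13s} [{low:.3f}, {high:.3f}]  excludes zero: {low * high > 0}")
```

Small differences in the last decimals relative to the table are expected, since the critical value is rounded.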
The p-value for a given coefficient (see Table.1) indicates the probability of
obtaining a t-ratio at least as extreme as the one observed, assuming the true coefficient were
zero (i.e. the variable had no explanatory power).
The F-statistic in the table is also of importance. It provides a measure of the global
significance of the regression. We have that F(3, 63) = 28.93.
The F-statistic is basically the ratio between the explained variance and the
unexplained variance of the regression (it can also be expressed as
$F = \frac{R^2 / n}{(1 - R^2) / (T - n - 1)}$, where T is the sample size and n the number of regressors). A high ratio thus means a significant and good fit to data. The
corresponding p-value we get from Table.1 (7.01e-12) is also much smaller than 0.05. The null
hypothesis, which states that all slope coefficients are zero, i.e. that the dependent
variable can only be interpreted as a stochastic variation not explained by the
variation of the independent variables, is thus rejected, and we may conclude that the coefficients are
jointly significant. The low p-value associated with the F-statistic of the
regression indicates the probability of finding a regression with an F-statistic at least this large by
chance, assuming all slope coefficients were zero.
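The global F-statistic can be recovered from the $R^2$ of Table.1 alone. The following minimal sketch, with a hypothetical function name, applies the ratio of explained to unexplained variance, each divided by its degrees of freedom.

```python
def f_statistic(r2, t, n):
    """Global F-statistic from R-squared, sample size t and n regressors."""
    explained = r2 / n                      # explained share per regressor
    unexplained = (1.0 - r2) / (t - n - 1)  # residual share per degree of freedom
    return explained / unexplained

f_stat = f_statistic(0.579374, 67, 3)
print(round(f_stat, 2))
```

The result should match the F(3, 63) entry of Table.1 up to rounding of the reported $R^2$.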
The coefficients of the log-log model can be interpreted as elasticities (under ceteris
paribus conditions, i.e. varying one explanatory variable at a time while keeping the others
constant). As such, and looking at the estimated model equation, we estimate that for every
1% increase in Price, Sales decrease by about 0.689%. The price
elasticity, which can be generally written as $E = \frac{P}{Q} \cdot \frac{dQ}{dP}$, where the second fraction is
the derivative of the demand curve with respect to price, measures the percentage change in
demand for a 1% change in price. When it is zero, the demand is said to be perfectly inelastic:
the demand is the same, no matter the price. When the elasticity presents values between
-1 and 0 (as in our regression), the demand is said to be relatively inelastic (not very
sensitive to price, especially if the value is closer to zero). When the elasticity value is -1,
the demand is said to be unit elastic, i.e. a given percentage change in price produces an equal
percentage change in demand in the opposite direction. Finally, when the elasticity is lower than -1, the demand is said to be
relatively elastic or elastic, and we enter the zone of the curve where demand is highly
sensitive to price variations.
The interpretation of the other elasticities is less straightforward but still makes
managerial sense: a 1% increase in client Portfolio is estimated to result in a
0.682% increase in Sales, and a 1% increase in Sales Force is estimated to result in a
0.600% increase in Sales. Therefore, if the Portfolio is around 32 and the management
wants it to be around 40 in the short term, a 25% increase, that would likely result in
about a 17% increase in Sales (0.25 x 0.682). In the same way, if the team is increased to
5 people from 4 (again a 25% increase), Sales would be expected to rise in the short
term by about 15% (0.25 x 0.60).
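The arithmetic of these scenarios can be sketched in a few lines of Python. Multiplying the elasticity by the percentage change is a linear approximation that is most accurate for small changes; for a change as large as 25%, the constant-elasticity form of the log-log model, $(1 + \Delta)^{\beta} - 1$, gives a somewhat smaller figure. The function name `sales_response` is an assumption made for this example.

```python
def sales_response(pct_change, elasticity):
    """Return (linear approximation, constant-elasticity value) of the
    relative change in Sales for a relative change in one regressor."""
    linear = elasticity * pct_change
    exact = (1.0 + pct_change) ** elasticity - 1.0
    return linear, exact

# Elasticities rounded from the estimated model equation.
scenarios = {
    "Portfolio +25%": sales_response(0.25, 0.682),
    "Sales Force +25%": sales_response(0.25, 0.600),
    "Price +1%": sales_response(0.01, -0.689),
}
for name, (linear, exact) in scenarios.items():
    print(f"{name:17s} linear {linear:+.2%}   constant-elasticity {exact:+.2%}")
```

For the 1% price change the two figures are practically identical, while for the 25% scenarios the linear rule slightly overstates the expected increase.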
A battery of tests has been performed on the regression (discussed in more detail in
Section 4, with detailed estimations presented in Section 6), showing that:
• The residual follows a normal distribution;
• There is no sign of perfect collinearity among any independent variables;
• The RESET test indicates an adequate specification;
• White’s test for heteroscedasticity shows no traces of heteroscedasticity;
• The Durbin-Watson test indicates no autocorrelation in the error terms;
• The Breusch-Godfrey test indicates no autocorrelation in the error terms, and,
• The Chow test with a structural break at observation 34 indicates no structural
break.
The results seem to indicate that this model and the resulting regression can be used for
statistical inference and statistical predictions. The estimated
versus observed values for the dependent variable, within a 95% confidence interval, are shown below.

[Figure: observed l_Sales versus forecast (fitted) values with 95 percent interval, 2007-2012.]
3.4.2 Extrapolations
The business unit under focus uses simple univariate regressions embedded in Excel.
The methods used are linear, exponential and polynomial regressions of monthly Sales.
For comparison purposes, the three methods of regression applied to the same series of
Sales (67 observations, ranging from February 2007 to August 2012) result in:

[Equations: estimated linear, exponential and polynomial trend regressions of Sales on time.]
All these models are easier to compute than the previous econometric model, but they
do not integrate any explanatory variables, and therefore do not allow for any inference
and forecasting accounting for variation scenarios in any of the key explanatory factors
we considered before, and that we have shown to be statistically significant to explain
the dependent variable (sales).
Although these simpler regressions are practical and straightforward to use for managerial purposes, we believe, and
our previous model discussion shows, that they could be
easily replaced or complemented by the model portrayed in the previous sub-section,
with important advantages for management purposes.
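For readers who wish to reproduce this kind of univariate extrapolation outside a spreadsheet, the following Python sketch fits linear, exponential and polynomial trends of a series on time by least squares. Several choices here are assumptions: the exponential trend is fitted as a straight line on the logarithm of the series, the polynomial is taken to be of second order, the helper names are hypothetical, and the series is invented for the example.

```python
import math

def solve(a, b):
    """Gaussian elimination with partial pivoting for a small system a x = b."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def polyfit(t, y, degree):
    """Least-squares coefficients [c0, c1, ...] of y on powers of t."""
    k = degree + 1
    xtx = [[sum(ti ** (i + j) for ti in t) for j in range(k)] for i in range(k)]
    xty = [sum((ti ** i) * yi for ti, yi in zip(t, y)) for i in range(k)]
    return solve(xtx, xty)

def fit_trends(sales):
    """Return forecasting functions of the time index for three trend types."""
    t = list(range(1, len(sales) + 1))
    a_lin, b_lin = polyfit(t, sales, 1)
    # Exponential trend: straight line fitted to ln(sales).
    ln_a, b_exp = polyfit(t, [math.log(s) for s in sales], 1)
    c0, c1, c2 = polyfit(t, sales, 2)
    return {
        "linear": lambda h: a_lin + b_lin * h,
        "exponential": lambda h: math.exp(ln_a + b_exp * h),
        "quadratic": lambda h: c0 + c1 * h + c2 * h * h,
    }

# Invented series growing 3% per month (hypothetical, not the paper's data).
sales = [50000.0 * 1.03 ** i for i in range(24)]
trends = fit_trends(sales)
next_month = len(sales) + 1
for name, forecast in trends.items():
    print(name, round(forecast(next_month)))
```

None of these trend functions takes Price, Sales Force or Portfolio as an input, which is the limitation discussed above.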
4. Tests and Inference
In this section we test for possible violations of the basic assumptions allowing the use
of Ordinary Least Squares regression and the interpretation of data as a time series.
4.1 Adequate Specification test
We use the Ramsey “RESET” test. The Ramsey test is a general specification test for
the linear regression model. More specifically, it tests whether non-linear combinations
of the fitted values help explain the response variable.
The intuition behind the test is that if non-linear combinations of the explanatory
variables have any power in explaining the dependent variable, the model is not
adequately specified.
The null hypothesis is that of an adequate specification. We have run an auxiliary
regression that adds the squares and the cubes of the fitted values to the original
regressors (see Section 6). The statistic used can be expressed by
$F = \frac{(R^2_{new} - R^2_{old})/k}{(1 - R^2_{new})/(N - g)} \sim F(k, N - g)$,
where $R^2_{old}$ and $R^2_{new}$ are the coefficients of determination of the original and
auxiliary regressions, k is the number of new regressors, g the total number of
coefficients in the auxiliary regression and N the total number of observations.
We have obtained a test statistic, F(2, 61) = 0.137224, with p-value = P(F(2, 61) >
0.137224) = 0.872044.
At a significance level of 5%, the critical value of F(2, 61) is larger than the test
statistic, the p-value is above 0.05, and therefore we cannot reject the null hypothesis of
correct specification.
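As a minimal sketch of how such an F statistic could be computed, assuming the usual form of the RESET test in which the auxiliary and original regressions are compared through their coefficients of determination: the function name `reset_f` and the R-squared values in the usage lines are hypothetical placeholders, not figures from our estimation.

```python
def reset_f(r2_old: float, r2_new: float, k: int, n_obs: int, g: int) -> float:
    """F statistic for k added regressors; g = coefficients in the auxiliary regression."""
    if not (0.0 <= r2_old <= r2_new < 1.0):
        raise ValueError("expected 0 <= r2_old <= r2_new < 1")
    numerator = (r2_new - r2_old) / k
    denominator = (1.0 - r2_new) / (n_obs - g)
    return numerator / denominator


# Hypothetical R-squared values, chosen only to illustrate the call.
# k = 2 added terms (squares and cubes of the fitted values), N = 67, g = 6,
# so the statistic is referred to an F(2, 61) distribution.
f_stat = reset_f(r2_old=0.5700, r2_new=0.5719, k=2, n_obs=67, g=6)
print(round(f_stat, 4))
```

With two added terms and six coefficients in the auxiliary regression, the denominator degrees of freedom are 67 - 6 = 61, which matches the F(2, 61) reported above.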
4.2 Multicollinearity
A multiple regression model with correlated explanatory variables can indicate how
well the entire bundle of independent variables predicts the outcome variable, but it may
not give valid results about any individual variable, or about which variables are
redundant with respect to others (i.e. linearly dependent on other variables).
We use two formal detection measures for multicollinearity: the tolerance and the
variance inflation factor (VIF). Tolerance can be defined as $1 - R_j^2$, where $R_j^2$ is the
coefficient of determination of a regression of variable j on all the other explanatory
variables (not including the dependent variable).
The variance inflation factor is defined as $VIF_j = 1/(1 - R_j^2)$, the reciprocal of the
tolerance. A tolerance of less than 0.20 or 0.10, or a VIF of 5 or 10 and above, indicates
multicollinearity. A high VIF value, or a low tolerance value, indicates the degree to
which the variance of an estimated coefficient is inflated by the effect of
multicollinearity (the standard errors of the estimated coefficients are inflated when
multicollinearity is present).
In our model and regression, the VIFs of the variables are all between 1.3 and
1.5, indicating no multicollinearity:
• l_Portfolio 1.487
• l_Price 1.350
• l_Salesforce 1.468
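The relation between the two measures can be sketched as follows, using the VIF values listed above. The helper names and the choice of the stricter thresholds (VIF of 5, tolerance of 0.20) are assumptions made for the illustration.

```python
# VIFs reported for the model (from the Gretl output).
vifs = {"l_Portfolio": 1.487, "l_Price": 1.350, "l_Salesforce": 1.468}


def tolerance_from_vif(vif: float) -> float:
    """Tolerance is the reciprocal of the VIF, i.e. 1 - R_j^2."""
    return 1.0 / vif


def r2_from_vif(vif: float) -> float:
    """R_j^2 of the regression of variable j on the other regressors."""
    return 1.0 - 1.0 / vif


for name, vif in vifs.items():
    tol = tolerance_from_vif(vif)
    flagged = vif >= 5.0 or tol < 0.20
    print(f"{name}: tolerance={tol:.3f}, R_j^2={r2_from_vif(vif):.3f}, flagged={flagged}")
```

All three tolerances are well above 0.20, so no variable is flagged under either rule of thumb.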
4.3 Heteroscedasticity
It is important to test whether the variance of the disturbance term is constant along the
sample. A constant variance is designated homoscedasticity, while a variance that
changes along the sample is known as heteroscedasticity.
Heteroscedasticity, when present, means the OLS estimators are no longer efficient. The
regression residuals against the fitted dependent variable are depicted below:
[Figure: Regression residuals (= observed - fitted l_Sales), horizontal axis: l_Sales.]
The tests we have used to evaluate the presence of heteroscedastic residuals were the
White test and the Breusch-Pagan test (a regression of the squared residuals on the
independent variables, with the null hypothesis of homoscedasticity).
The White test for constant variance is performed through an auxiliary regression of the
squared residuals from the original regression on the original explanatory variables,
their cross products and their squares.
The resulting $R^2$ times the sample size is the Lagrange multiplier statistic, which follows
a Chi-squared distribution with a number of degrees of freedom equal to the number of
regressors in the auxiliary regression (excluding the constant).
The unadjusted $R^2$ of that auxiliary regression is 0.11 and the test statistic is 7.69,
while the critical value of the Chi-squared(9) is 16.92 at a 5% significance level. The
null hypothesis of the test is homoscedasticity (no heteroscedasticity).
Unadjusted R-squared = 0.114886
Test statistic: TR^2 = 7.697350, with p-value = P(Chi-square(9) > 7.697350) =
0.564910.
Since the critical value of the Chi-square(9) is 16.919 at 5% significance, thus above the
test statistic of 7.697, and the p-value is high, we cannot reject the null hypothesis
(heteroscedasticity not present).
White's test using only squares and the Breusch-Pagan test present similar results:
• White's test for heteroscedasticity (squares only): null hypothesis:
heteroscedasticity not present. Test statistic: LM = 6.96137, with p-value =
P(Chi-square(6) > 6.96137) = 0.324434.
• Breusch-Pagan test for heteroscedasticity. Null hypothesis: heteroscedasticity
not present. Test statistic: LM = 4.80767, with p-value = P(Chi-square(3) >
4.80767) = 0.186435.
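The arithmetic behind the full White test can be retraced in a few lines. This is only a sketch that recomputes the statistic from the rounded figures quoted in this sub-section; the variable names are ours.

```python
T = 67                 # observations
r2_aux = 0.114886      # unadjusted R-squared of the auxiliary regression
chi2_crit_9 = 16.919   # 5% critical value of Chi-square(9), as quoted in the text

# Lagrange multiplier statistic: sample size times the auxiliary R-squared.
lm = T * r2_aux
reject_homoscedasticity = lm > chi2_crit_9
print(f"LM = {lm:.4f}, reject H0: {reject_homoscedasticity}")
```

The small difference from the reported 7.697350 comes from using the rounded R-squared.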
4.4 Normality of the residuals
As shown in more detail in Section 6, the expected value of the random disturbance
term is zero, $E(u_t \mid X) = 0$, and the residuals fit a normal distribution well.
4.5 Autocorrelation
Autocorrelation of the residuals is often a problem in time series models. It can be
interpreted as the cross-correlation of a signal with itself, or the similarity between
observations across time, a sort of repeated pattern buried in the data.
Autocorrelation of the errors, which are themselves unobservable, can generally be
detected because it produces autocorrelation in the observable residuals. Autocorrelation
violates the ordinary least squares (OLS) assumption that the error terms are uncorrelated.
While it does not bias the OLS coefficient estimates, the standard errors tend to be
underestimated (and the t-scores overestimated) when the autocorrelations of the errors
at low lags are positive.
The Breusch-Godfrey test for autocorrelation of the residuals up to order 12 presents
a very high p-value, with a critical value for F(12, 51) of 1.947 (at 5% significance)
higher than the test statistic LMF of 0.1359. The null hypothesis (no autocorrelation)
cannot be rejected.
The Durbin-Watson statistic was also used, defined as
$d = \frac{\sum_{t=2}^{T} (e_t - e_{t-1})^2}{\sum_{t=1}^{T} e_t^2}$, where T is the
number of observations and $e_t$ the residual. The value of d varies between 0 and 4.
For our sample size and number of regressors, the Durbin-Watson tables present a lower
bound $d_L$ and an upper bound $d_U$, while the regression Durbin-Watson statistic has a
value of 1.8623. This is a value higher than $d_U$ and lower than $4 - d_U$ (2.3012),
indicating no statistical evidence that the error terms are negatively or positively
correlated (i.e. the disturbance terms are independent).
Further estimations are presented in Section 6.
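A minimal sketch of the Durbin-Watson computation and of the decision rule used above follows. It assumes the standard definition of the statistic; the toy residuals are hypothetical, and the upper bound is backed out of the value 2.3012 quoted in the text.

```python
from typing import Sequence


def durbin_watson(residuals: Sequence[float]) -> float:
    """d = sum_{t=2..T} (e_t - e_{t-1})^2 / sum_{t=1..T} e_t^2."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    if den == 0.0:
        raise ValueError("all residuals are zero")
    return num / den


def no_autocorrelation_zone(d: float, d_upper: float) -> bool:
    """True when d lies strictly between d_U and 4 - d_U."""
    return d_upper < d < 4.0 - d_upper


# Toy residuals (hypothetical) that alternate in sign: strong negative autocorrelation.
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))

# Values quoted in the text: d = 1.8623 and 4 - d_U = 2.3012.
d_upper = 4.0 - 2.3012
print(no_autocorrelation_zone(1.8623, d_upper))
```

Values of d close to 2 point to uncorrelated residuals, values near 0 to positive autocorrelation and values near 4 to negative autocorrelation.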
4.6 Model Stability
Chow’s test was used to evaluate whether the coefficients of two model regressions
using the data with a break at observation 34 are identical or not. The null hypothesis
states that the coefficients are the same (no structural break).
As further shown in Section 6, we have not found a structural break, and the dynamics
of the sample seem to be coherent over the whole time interval taken for the regression.
4.7 Inference and Predictions
Since our model complies with the OLS base assumptions (see also Section 6), we can
safely use statistical inference to draw a few conclusions.
The 99% confidence ellipses for Ln(Price) and Ln(Portfolio), for Ln(Salesforce) and
Ln(Price), and for Ln(Portfolio) and Ln(Salesforce) are depicted in the figures below,
as well as the observed minus fitted values of the dependent variable, i.e. the
regression residuals.
Predictions can be made by estimating the model of sub-section 3.4.1 on a
part of the sample. We did so from February 2007 to December 2011, and then used the
resulting regression equation to predict sales from January 2012 to August 2012.
Please note that since this particular regression is run over a sub-sample, the
regression coefficients are centered on slightly different values:
t(55, 0.025) = 2.004
Variable       Coefficient   95% confidence interval
const           9.77450      (8.37470, 11.1743)
l_Portfolio     0.674835     (0.250883, 1.09879)
l_Price        -0.675421     (-1.08791, -0.262931)
l_Salesforce    0.546571     (0.00497455, 1.08817)
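The intervals in this table can be read back into standard errors and significance statements. The sketch below assumes symmetric intervals of the form coefficient plus or minus t times the standard error; the helper names are ours.

```python
T_CRIT = 2.004  # t(55, 0.025), as reported with the sub-sample regression

# (coefficient, lower bound, upper bound) from the table above.
intervals = {
    "const":        (9.77450,   8.37470,    11.1743),
    "l_Portfolio":  (0.674835,  0.250883,   1.09879),
    "l_Price":      (-0.675421, -1.08791,   -0.262931),
    "l_Salesforce": (0.546571,  0.00497455, 1.08817),
}


def implied_std_error(lower: float, upper: float, t_crit: float = T_CRIT) -> float:
    """Half-width of the interval divided by the critical t value."""
    return (upper - lower) / (2.0 * t_crit)


def excludes_zero(lower: float, upper: float) -> bool:
    """A 95% interval that excludes zero means significance at the 5% level."""
    return lower > 0.0 or upper < 0.0


for name, (coef, lo, hi) in intervals.items():
    se = implied_std_error(lo, hi)
    print(f"{name}: coef={coef:.4f}, implied s.e.={se:.4f}, "
          f"excludes zero={excludes_zero(lo, hi)}")
```

All four intervals exclude zero, although the lower bound for l_Salesforce is very close to it.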
[Figure: 99% confidence ellipse and 99% marginal intervals for l_Portfolio (horizontal axis) and l_Price, estimates at (0.682, -0.689).]
[Figure: 99% confidence ellipse and 99% marginal intervals for l_Price (horizontal axis) and l_Salesforce, estimates at (-0.689, 0.6).]
[Figure: 99% confidence ellipse and 99% marginal intervals for l_Portfolio (horizontal axis) and l_Salesforce, estimates at (0.682, 0.6).]
[Figure: Regression residuals (= observed - fitted l_Sales), 2007 to 2012.]
The predictions for the months of 2012 (January to August) are presented below:
[Figure: l_Sales, forecast and 95 percent interval, 2010 to 2012.]
Forecasting for future periods can be performed by setting appropriate levels for the
variables under the control of management or the sales team, such as sales force size
and price, in the equation of the model estimated in sub-section 3.4.1, and by
introducing target levels for portfolio size, for example. Forecasting scenarios can
be built by fixing the values of two independent variables and varying the third (i.e.
ceteris paribus), and observing the resulting value of the dependent variable.
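A ceteris paribus scenario of this kind can be sketched as follows, using the sub-sample coefficients reported in sub-section 4.7. The baseline levels of portfolio, price and sales force are hypothetical and serve only to illustrate the call. The point forecast is obtained by simply exponentiating the predicted logarithm, with no retransformation correction.

```python
import math

# Coefficients of the sub-sample regression (February 2007 to December 2011).
COEF = {"const": 9.77450, "l_Portfolio": 0.674835,
        "l_Price": -0.675421, "l_Salesforce": 0.546571}


def predict_sales(portfolio: float, price: float, salesforce: float) -> float:
    """Point forecast of Sales implied by the log-log equation."""
    l_sales = (COEF["const"]
               + COEF["l_Portfolio"] * math.log(portfolio)
               + COEF["l_Price"] * math.log(price)
               + COEF["l_Salesforce"] * math.log(salesforce))
    return math.exp(l_sales)


# Hypothetical baseline levels, used only to illustrate a ceteris paribus scenario.
base = predict_sales(portfolio=20.0, price=3.0, salesforce=3.0)
scenario = predict_sales(portfolio=20.0, price=3.0 * 1.10, salesforce=3.0)  # price +10%

change = scenario / base - 1.0
print(f"Sales change for a 10% price increase: {change:.2%}")
```

Because the model is log-log, the relative change in sales depends only on the relative change in price and on the price coefficient, not on the baseline levels chosen.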
4.8 Dynamic Models
To search for the short and long term elasticities of the variables integrating the model
developed we have made a regression including the lagged term of the dependent
variable (Partial Adjustment Model).
However, statistical significance for the lagged term was not found (small t-ratio, high
p-value), which seems to indicate that the static model is better.
A linear restriction test hypothesizing the coefficient of the lagged term to be zero
returned a high p-value, and therefore we cannot reject the null hypothesis that this
coefficient is equal to zero.
OLS, using observations 2007:03-2012:08 (T = 66)
Dependent variable: l_Sales
               Coefficient   Std. Error   t-ratio   p-value
const           8.89535      1.33457       6.6653   <0.00001  ***
l_Portfolio     0.605383     0.200736      3.0158    0.00373  ***
l_Price        -0.667782     0.205363     -3.2517    0.00187  ***
l_Salesforce    0.508548     0.234869      2.1652    0.03429  **
l_Sales_1       0.0968393    0.109802      0.8819    0.38127
Mean dependent var   11.94211    S.D. dependent var   0.663020
Sum squared resid    11.71626    S.E. of regression   0.438258
R-squared            0.589964    Adjusted R-squared   0.563076
F(4, 61)             21.94182    P-value(F)           2.95e-11
Log-likelihood      -36.60360    Akaike criterion     83.20720
Schwarz criterion    94.15547    Hannan-Quinn         87.53338
rho                 -0.039909    Durbin's h          -0.691776
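For illustration only, the long-run elasticities implied by a partial adjustment model can be obtained by dividing each short-run coefficient by one minus the coefficient of the lagged dependent variable. The sketch below applies this standard relation to the estimates in the table above; since the lagged term is not statistically significant, the resulting figures should be read with caution.

```python
# Short-run coefficients from the partial adjustment regression above.
short_run = {"l_Portfolio": 0.605383, "l_Price": -0.667782, "l_Salesforce": 0.508548}
lag_coef = 0.0968393   # coefficient of l_Sales_1


def long_run_elasticity(beta: float, lam: float) -> float:
    """Long-run elasticity beta / (1 - lambda) in a partial adjustment model."""
    if not -1.0 < lam < 1.0:
        raise ValueError("the lag coefficient must lie inside (-1, 1)")
    return beta / (1.0 - lam)


for name, beta in short_run.items():
    print(f"{name}: short run {beta:.3f}, "
          f"long run {long_run_elasticity(beta, lag_coef):.3f}")
```

With a lag coefficient this close to zero, short-run and long-run elasticities are nearly identical, which is in line with the static model being adequate.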
A further regression of the dependent variable on all the independent variables, their
lagged terms and the lagged term of the dependent variable (Autoregressive Distributed Lag
Model) did not show statistical significance for most coefficients.
OLS, using observations 2007:03-2012:08 (T = 66)
Dependent variable: l_Sales
                 Coefficient   Std. Error   t-ratio   p-value
const             9.21283      1.53228       6.0125   <0.00001  ***
l_Portfolio       0.59641      0.221189      2.6964    0.00916  ***
l_Portfolio_1     0.0522092    0.243777      0.2142    0.83117
l_Price          -0.645508     0.22092      -2.9219    0.00495  ***
l_Price_1        -0.0467767    0.223199     -0.2096    0.83473
l_Salesforce     -0.217245     0.71804      -0.3026    0.76331
l_Salesforce_1    0.780354     0.733461      1.0639    0.29177
l_Sales_1         0.056814     0.128607      0.4418    0.66030
Mean dependent var   11.94211    S.D. dependent var   0.663020
Sum squared resid    11.48317    S.E. of regression   0.444956
R-squared            0.598121    Adjusted R-squared   0.549618
F(7, 58)             12.33172    P-value(F)           1.52e-09
Log-likelihood      -35.94047    Akaike criterion     87.88094
Schwarz criterion    105.3982    Hannan-Quinn         94.80283
rho                 -0.032556    Durbin-Watson        2.053078
Furthermore, the p-values of all linear restriction tests hypothesizing each lagged term
to be zero are well above 0.05, which means we cannot reject the null hypothesis for
each of them, i.e. no lagged term is relevant per se. A joint significance test for the
coefficients of all lagged terms also returns a p-value much higher than 0.05: the null
hypothesis that all coefficients of the lagged terms are simultaneously zero cannot be
rejected.
This indicates that the static model still seems better than this dynamic variant.
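The information criteria reported with the two dynamic regressions can be retraced from their log-likelihoods. This is a sketch assuming the usual definitions of the Akaike and Schwarz criteria; the function names are ours.

```python
import math


def aic(loglik: float, k: int) -> float:
    """Akaike criterion: -2 logL + 2k."""
    return -2.0 * loglik + 2.0 * k


def bic(loglik: float, k: int, n_obs: int) -> float:
    """Schwarz criterion: -2 logL + k ln(T)."""
    return -2.0 * loglik + k * math.log(n_obs)


T = 66
models = {
    # name: (log-likelihood, number of estimated coefficients)
    "partial adjustment": (-36.60360, 5),
    "autoregressive distributed lag": (-35.94047, 8),
}
for name, (ll, k) in models.items():
    print(f"{name}: AIC={aic(ll, k):.4f}, BIC={bic(ll, k, T):.4f}")
```

Both criteria are lower for the partial adjustment regression than for the autoregressive distributed lag regression, so the three extra lagged regressors do not pay for themselves in terms of fit.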
5. Conclusions
We have discussed an econometric model that reasonably explains the sales behavior of
an exports business unit of a Portuguese firm and shown such a model to be of
straightforward implementation and based on accessible data gathering.
Well specified econometric models provide important measures of confidence for the
regression coefficients, allowing for more informed forecasting and decision making.
The model developed could easily replace or complement the univariate regressions
currently in use, with advantages for the confidence levels of forecasts and for the
understanding of the explanatory role of significant variables on sales.
In fact, the econometric model also fits the data better than any of the univariate
regressions currently used at the firm, and can be used to build forecasting scenarios
with likely greater accuracy, by varying one independent variable at a time on a ceteris
paribus basis and observing the outcome on the dependent variable.
This study contains limitations as well as opportunities for further research. An obvious
limitation could be the exclusion of macroeconomic variables that could help explain
demand, and therefore the dependent variable, to a higher degree. However, such a
consideration should be more important for business units focusing on homogeneous
markets (national markets), since under such circumstances demand should be more
closely linked to GDP per capita, to the average per capita consumption of the specific
goods sold, or to both. Competitors' pricing could be added to the model, but such a
variable is not easy to include, since the data is not available (our price is the price to
intermediaries per unit mass, not a retail price). Promotional and advertising
expenditures could also be added (however, in our case, such expenses are mainly an
investment of international distributors, and therefore a variable which is both harder to
obtain and outside the control of the business unit management).
Another limitation, since the model developed can only explain about 55% of the
variation in the dependent variable, lies in its limited usefulness for predicting distant
periods, as is almost always the case with forecasting techniques.
We can only recommend forecasts for no more than six to twelve months into the future,
due to the complex dynamics inherent to sales and the resulting uncertainty hanging over
any sales forecasting model.
On the other hand, an opportunity for further research could be attempting to further
generalize the econometric model developed, in order to enable it to forecast the sales of
different SMEs operating in the FMCG sector (Fast Moving Consumer Goods), and in
particular Portuguese FMCG SMEs, which more or less share quite a number of
structural characteristics. We also believe the model could be generalized, or easily
modified, to explain the sales of exports units in other sectors as well.
Another possible opportunity for further research lies in the study of price elasticity
over time and its explanatory power, or use, for brand valuation purposes.
It is interesting to note that our log-log model provides a measure of the combined
elasticity for the products sold to several international markets. Overall, the coefficient
for price (i.e. the price elasticity) points to a relatively inelastic demand, which is
surprising and very interesting considering the goods sold are easy to replace. This
seems to be consistent with the brand factor embedded in the goods, which allows the
business unit to export to faraway markets (a case study of its own in its business
sector).
6. Annex
All model estimations used Gretl 1.9.10 (5.11.2012 version). The univariate regressions
were an output of Excel.
6.1 Assumptions for using the OLS with a Time Series Interpretation
We can immediately state that the form of the model presented in Section 3 is linear
(after applying the natural logarithmic function), the number of observations (67) is
greater than the number of parameters (4), and there is variation in the values of all
independent variables. The other assumptions need specific testing, shown throughout
this section and identified below:
i) Linearity in the parameters. → Form of the model in the independent variables.
ii) Exogeneity of the explanatory variables. → Hausman test.
iii) $u_t \mid X \sim N(0, \sigma^2)$ → Conditional normality of residuals.
iv) $Var(u_t \mid X) = \sigma^2$ → Tests for heteroscedasticity.
v) $Cov(u_t, u_s \mid X) = 0$ for $t \neq s$ → Test for autocorrelation.
vi) The number of observations T must be greater than the number of
parameters. → T=67, 4 parameters (including the constant).
vii) There must be variation in the values of all independent variables. → Data.
viii) No exact correlation between the independent variables. → Multicollinearity
test.
ix) The model is correctly specified. → RESET test.
6.1 Normality of the Random Disturbance Term
We also know that the expected value of the random disturbance term is zero,
$E(u_t \mid X) = 0$ (see plots below), and that it fits a normal distribution well:
Frequency distribution for uhat1, obs 1-67, number of bins = 9, mean = 5.03743e-016,
sd = 0.444192.
interval midpt frequency rel. cum.
< -1.1659 -1.2969 2 2.99% 2.99% *
-1.1659 - -0.90393 -1.0349 0 0.00% 2.99%
-0.90393 - -0.64197 -0.77295 3 4.48% 7.46% *
-0.64197 - -0.38000 -0.51099 6 8.96% 16.42% ***
-0.38000 - -0.11804 -0.24902 12 17.91% 34.33% ******
-0.11804 - 0.14393 0.012943 20 29.85% 64.18% **********
0.14393 - 0.40589 0.27491 13 19.40% 83.58% ******
0.40589 - 0.66785 0.53687 7 10.45% 94.03% ***
>= 0.66785 0.79884 4 5.97% 100.00% **
Test for null hypothesis of normal distribution: Chi-square(2) = 5.662 with p-value
0.05894.
[Figure: Density of uhat1 against N(5.0374e-016, 0.44419). Test statistic for normality: Chi-square(2) = 5.662 [0.0589].]
[Figure: Q-Q plot for uhat1 against normal quantiles, with the line y = x.]
The Chi-Square for 2 degrees of freedom and a significance level of 5%, has a critical
value of 5.99146, above 5.662. The p-value is also above 0.05. The null hypothesis
cannot be rejected.
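For a Chi-square distribution with 2 degrees of freedom, the p-value and the critical value used above have closed forms, which gives a quick way to check the figures. The function names below are ours.

```python
import math


def chi2_sf_2df(x: float) -> float:
    """P(Chi-square(2) > x); with 2 degrees of freedom this is exp(-x / 2)."""
    if x < 0.0:
        raise ValueError("the statistic must be non-negative")
    return math.exp(-x / 2.0)


def chi2_crit_2df(alpha: float) -> float:
    """Critical value c with P(Chi-square(2) > c) = alpha."""
    return -2.0 * math.log(alpha)


print(f"{chi2_sf_2df(5.662):.5f}")    # p-value of the residual normality test
print(f"{chi2_crit_2df(0.05):.5f}")   # 5% critical value
```

The results agree with the reported p-value of 0.05894 and critical value of 5.99146 up to the rounding of the test statistic.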
6.2 Heteroscedasticity
White’s test for heteroscedasticity shows that the variance of the disturbance term is
basically constant.
White's test for heteroscedasticity with null hypothesis: heteroscedasticity not present.
Test statistic: LM = 7.69735 with p-value = P(Chi-square(9) > 7.69735) = 0.56491.
White's test for heteroscedasticity, OLS, using observations 1-67, dependent variable:
uhat^2.
coefficient std. error t-ratio p-value
const -4.80831 3.75831 -1.279 0.2059
l_Portfolio 2.65727 1.84708 1.439 0.1557
l_Price 1.57591 1.81795 0.8669 0.3897
l_Salesforce 1.18288 4.07945 0.2900 0.7729
sq_l_Portfolio -0.460071 0.294185 -1.564 0.1234
X2_X3 -0.223413 0.455339 -0.4907 0.6256
X2_X4 0.308763 0.945163 0.3267 0.7451
sq_l_Price -0.311194 0.284098 -1.095 0.2780
X3_X4 -0.347550 1.22089 -0.2847 0.7769
sq_l_Salesforce -1.05806 1.11451 -0.9494 0.3465
Unadjusted R-squared = 0.114886
Test statistic: TR^2 = 7.697350, with p-value = P(Chi-square(9) > 7.697350) =
0.564910.
Since the Chi-Square (9) is 16.919 at 5% significance, thus above the test statistic of
7.697, the p-value is high and we cannot reject the null hypothesis (heteroscedasticity
not present).
6.3 Multicollinearity
We can also state that there are no traces of perfect collinearity among the independent
variables. This has been assessed through the computation of the Variance Inflation
Factors, which provide an index measuring how much the variance of an estimated
regression coefficient increases because of collinearity. No coefficient presents a VIF
above 10.
Variance Inflation Factors
Minimum possible value = 1.0
Values > 10.0 may indicate a collinearity problem
l_Portfolio 1.487
l_Price 1.350
l_Salesforce 1.468
VIF(j) = 1/(1 - R(j)^2), where R(j) is the multiple correlation coefficient between
variable j and the other independent variables.
6.4 Autocorrelation
There is no autocorrelation of the disturbance term as the Breusch-Godfrey test shows
below.
Breusch-Godfrey test for autocorrelation up to order 12, OLS, using observations,
2007:02-2012:08 (T = 67), dependent variable: uhat.
coefficient std. error t-ratio p-value
const -0.216662 0.802325 -0.2700 0.7882
l_Portfolio 0.0735869 0.253416 0.2904 0.7727
l_Price 0.0180643 0.228217 0.07915 0.9372
l_Salesforce -0.0311841 0.267454 -0.1166 0.9076
uhat_1 0.0798734 0.141615 0.5640 0.5752
uhat_2 -0.107188 0.141800 -0.7559 0.4532
uhat_3 0.0342331 0.143921 0.2379 0.8129
uhat_4 0.0513473 0.143277 0.3584 0.7215
uhat_5 -0.0578235 0.147794 -0.3912 0.6972
uhat_6 -0.00449540 0.142725 -0.03150 0.9750
uhat_7 -0.0525654 0.145586 -0.3611 0.7195
uhat_8 -0.000852480 0.144410 -0.005903 0.9953
uhat_9 0.0149031 0.144067 0.1034 0.9180
uhat_10 0.0599730 0.146162 0.4103 0.6833
uhat_11 0.0252232 0.149431 0.1688 0.8666
uhat_12 -0.0493138 0.155429 -0.3173 0.7523
Unadjusted R-squared = 0.030998
LM test for autocorrelation up to order 12 - Null hypothesis: no autocorrelation - Test
statistic: LMF = 0.135955, with p-value = P(F(12,51) > 0.135955) = 0.999695.
The critical value for F(12, 51) (at a 5% significance level) is far higher
than 0.135955 and the p-value is close to 1. The null hypothesis of no autocorrelation
cannot be rejected.
Other alternative statistics chosen in this test context also result in high p-values (no
autocorrelation):
• Test statistic: LMF = 0.135955, with p-value = P(F(12,51) > 0.135955) = 1;
• Alternative statistic: TR^2 = 2.076849, with p-value = P(Chi-square(12) >
2.07685) = 0.999; and,
• Ljung-Box Q' = 2.15796, with p-value = P(Chi-square(12) > 2.15796) = 0.999.
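Both forms of the Breusch-Godfrey statistic can be recomputed from the unadjusted R-squared of the auxiliary regression. This is a sketch assuming the usual F form of the LM test, with the rounded R-squared reported above; the variable names are ours.

```python
T = 67              # observations
k = 4               # coefficients in the original regression (including the constant)
p = 12              # lag order of the test
r2_aux = 0.030998   # unadjusted R-squared of the auxiliary regression

# Chi-square form: T * R^2, referred to Chi-square(p).
tr2 = T * r2_aux

# F form: referred to F(p, T - k - p) = F(12, 51).
df_den = T - k - p
lmf = (r2_aux / p) / ((1.0 - r2_aux) / df_den)

print(f"TR^2 = {tr2:.4f}, LMF = {lmf:.4f}, denominator df = {df_den}")
```

The denominator degrees of freedom, 67 - 4 - 12 = 51, match the F(12, 51) distribution used in the test.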
Finally, for our sample size and number of regressors, the Durbin-Watson tables present a
lower bound $d_L$ and an upper bound $d_U$, while the regression Durbin-Watson statistic
has a value of 1.8623. This is a value higher than $d_U$ and lower than $4 - d_U$ (2.3012),
confirming no autocorrelation in the error terms.
6.5 Model Specification
The model is well specified as the RESET test shows below.
RESET test for specification - Null hypothesis: specification is adequate - Test
statistic: F(2, 61) = 0.137224, with p-value = P(F(2, 61) > 0.137224) = 0.872044.
At a significance level of 5%, the critical value of F(2, 61) is larger than the test
statistic, the p-value is above 0.05, and we cannot reject the null hypothesis of correct
specification.
The auxiliary regression for the RESET specification test is:
Auxiliary regression for RESET specification test, OLS, using observations 1-67,
dependent variable: l_Sales.
coefficient std. error t-ratio p-value
const -147.736 683.209 -0.2162 0.8295
l_Portfolio -18.2850 79.6310 -0.2296 0.8192
l_Price 18.4218 80.3907 0.2292 0.8195
l_Salesforce -16.2096 70.0333 -0.2315 0.8177
yhat^2 2.27325 10.0222 0.2268 0.8213
yhat^3 -0.0616774 0.286432 -0.2153 0.8302
Test statistic: F = 0.137224, with p-value = P(F(2,61) > 0.137224) = 0.872.
The same tests, run only with squares and only with cubes, also indicate a good model
specification:
• RESET test for specification (squares only), Test statistic: F = 0.231643, with
p-value = P(F(1,62) > 0.231643) = 0.632; and,
• RESET test for specification (cubes only), Test statistic: F = 0.226464, with
p-value = P(F(1,62) > 0.226464) = 0.636.
6.6 Structural Break Test (Stability)
Chow’s test shows no structural breaks in the model when a test for a break at
observation 34 is introduced.
Chow test for structural break at observation 34 - Null hypothesis: no structural break -
Test statistic: F(3, 60) = 0.434218, with p-value = P(F(3, 60) > 0.434218) = 0.729285.
At a significance level of 5%, the critical value of F(3, 60) is greater than the test
statistic, the p-value is well above 0.05 and we cannot reject the null
hypothesis of no structural break in the model.
6.7 Residual Analysis
The plots of the residuals against the dependent variable and against each explanatory
variable, and the plot of actual versus predicted sales (in logs), are listed below:
[Figure: Regression residuals (= observed - fitted l_Sales) against l_Sales.]
[Figure: Regression residuals (= observed - fitted l_Sales) against l_Portfolio.]
[Figure: Regression residuals (= observed - fitted l_Sales) against l_Price.]
[Figure: Regression residuals (= observed - fitted l_Sales) against l_Salesforce.]
[Figure: Actual versus predicted l_Sales, with the line actual = predicted.]
6.8 Normality of the Dependent Variable
The normality of the dependent variable can be visually assessed by means of a Q-Q
plot or a density plot. This property is shown as a complement to the base
data, as it is not a base assumption for the correct application of OLS.
Frequency distribution for l_Sales, obs 1-67, number of bins = 9, mean = 11.9272, sd =
0.669145.
interval midpt frequency rel. cum.
< 10.326 10.151 3 4.48% 4.48% *
10.326 - 10.677 10.502 0 0.00% 4.48%
10.677 - 11.027 10.852 2 2.99% 7.46% *
11.027 - 11.377 11.202 8 11.94% 19.40% ****
11.377 - 11.727 11.552 13 19.40% 38.81% ******
11.727 - 12.077 11.902 14 20.90% 59.70% *******
12.077 - 12.428 12.253 5 7.46% 67.16% **
12.428 - 12.778 12.603 18 26.87% 94.03% *********
>= 12.778 12.953 4 5.97% 100.00% **
Test for null hypothesis of normal distribution: Chi-square(2) = 5.250 with p-value
0.07245. The Chi-square distribution with 2 degrees of freedom has a critical value of
5.991 at a significance level of 5%, above 5.250. The p-value is also above 0.05. The
null hypothesis cannot be rejected (normal distribution).
[Figure: Q-Q plot for l_Sales against normal quantiles, with the line y = x.]
[Figure: Density of l_Sales against N(11.927, 0.66915). Test statistic for normality: Chi-square(2) = 5.250 [0.0724].]
7. References
Armstrong, J., 1985. Long-Range Forecasting: From Crystal Ball to Computer. New York: John Wiley.
Armstrong, J., Brodie, R. & McIntyre, S., 1987. Forecasting methods for marketing. International
Journal of Forecasting, Volume 3, pp. 355-76.
Armstrong, J. & Collopy, F., 1998. Integration of statistical methods and judgment for time series
forecasting: principles from empirical research. s.l.: s.n.
Baker, M., 1999. Sales Forecasting. In: IEBM Encyclopedia of Marketing. s.l.: International Thomson
Business Press, pp. 278-290.
Banco de Portugal, 2011. A economia portuguesa em 2011. Lisboa: Departamento de Estudos
Económicos.
Blattberg, R. & Hoch, S., 1990. Database models and managerial intuition: 50 per cent model + 50 per
cent manager. Management Science, Volume 36, pp. 887-99.
Clemen, R., 1989. Combining forecasts: A review and annotated bibliography. International Journal of
Forecasting, Volume 5, pp. 559-83.
Crina, O., Bolton, R., Michael, D. & Walker, B., 2011. Balancing Risk and Return in a Customer
Portfolio. Journal of Marketing, Volume 75, pp. 1-17.
Dalrymple, D., 1975. Sales forecasting: methods and accuracy. Business Horizons, pp. 69-73.
Dalrymple, D., 1987. Sales forecasting practices: Results from a U.S. survey. International Journal of
Forecasting, Volume 3, pp. 379-91.
Gardner, E. & McKenzie, E., 1985. Forecasting trends in time series. Management Science, Volume 31,
pp. 1237-46.
Geweke, J., Horowitz, J. & Pesaran, H., 2008. Econometrics, The New Palgrave Dictionary of
Economics. s.l.:Palgrave Macmillan.
Gujarati, D. & Porter, D., 2009. Basic Econometrics. 5th ed. s.l.:McGraw-Hill/Irwin.
Lilien, G., Kotler, P. & Moorthy, K., 1992. Marketing Models. s.l.:Prentice Hall, Inc.
Makridakis, S., 1984. The Forecasting Accuracy of Major Time Series Methods. New York: John Wiley.
Marshall, A., 1890. The Principles of Economics. London: Macmillan and Co. Ltd.
Mentzer, J. & Kahn, K., 1995. Forecasting technique familiarity, satisfaction, usage, and application.
Journal of Forecasting, Volume 14, pp. 465-76.
Rumelt, R., 1991. How much does industry matter? Strategic Management Journal, 12(3), pp. 167-181.
Schnaars, S., 1984. Situational factors affecting forecasting accuracy. Journal of Marketing Research,
Volume 21, pp. 290-297.
The author would like to thank Professor Elias Soukiazis, of Coimbra Faculty of Economics, for
having reviewed this paper and for proposing additional diagnostic tests for the models
presented.

 
The Importance Of College Education - Peachy Essay
The Importance Of College Education - Peachy EssayThe Importance Of College Education - Peachy Essay
The Importance Of College Education - Peachy EssayAnna Landers
 
Sample Scholarship Essay
Sample Scholarship EssaySample Scholarship Essay
Sample Scholarship EssayAnna Landers
 
Introduction - How To Write An Essay - LibGuides At Univers
Introduction - How To Write An Essay - LibGuides At UniversIntroduction - How To Write An Essay - LibGuides At Univers
Introduction - How To Write An Essay - LibGuides At UniversAnna Landers
 
Professionalism Medicine Essay
Professionalism Medicine EssayProfessionalism Medicine Essay
Professionalism Medicine EssayAnna Landers
 
Scarecrow PRINTABLE Stationery Paper Etsy
Scarecrow PRINTABLE Stationery Paper EtsyScarecrow PRINTABLE Stationery Paper Etsy
Scarecrow PRINTABLE Stationery Paper EtsyAnna Landers
 
Vintage Handwriting Digital Paper Textures
Vintage Handwriting Digital Paper TexturesVintage Handwriting Digital Paper Textures
Vintage Handwriting Digital Paper TexturesAnna Landers
 
Legit Essay Writing Services Scholarship - HelpToStudy.Com
Legit Essay Writing Services Scholarship - HelpToStudy.ComLegit Essay Writing Services Scholarship - HelpToStudy.Com
Legit Essay Writing Services Scholarship - HelpToStudy.ComAnna Landers
 
Writing Papers For Money - College
Writing Papers For Money - CollegeWriting Papers For Money - College
Writing Papers For Money - CollegeAnna Landers
 
Paperback Writer The Beatles Bible
Paperback Writer The Beatles BiblePaperback Writer The Beatles Bible
Paperback Writer The Beatles BibleAnna Landers
 
How Does Custom Essay Help Servic
How Does Custom Essay Help ServicHow Does Custom Essay Help Servic
How Does Custom Essay Help ServicAnna Landers
 
003 Compare And Contrast Essay Examples College E
003 Compare And Contrast Essay Examples College E003 Compare And Contrast Essay Examples College E
003 Compare And Contrast Essay Examples College EAnna Landers
 
Dos And Don Ts Of Essay Writing. 15 Essay Writing D
Dos And Don Ts Of Essay Writing. 15 Essay Writing DDos And Don Ts Of Essay Writing. 15 Essay Writing D
Dos And Don Ts Of Essay Writing. 15 Essay Writing DAnna Landers
 
Korean Stationery Gift Envelope Finely Flower An
Korean Stationery Gift Envelope Finely Flower AnKorean Stationery Gift Envelope Finely Flower An
Korean Stationery Gift Envelope Finely Flower AnAnna Landers
 
How To Format A Paper In Mla. MLA Format For Essa
How To Format A Paper In Mla. MLA Format For EssaHow To Format A Paper In Mla. MLA Format For Essa
How To Format A Paper In Mla. MLA Format For EssaAnna Landers
 
Basics Of How To Write A Speech
Basics Of How To Write A SpeechBasics Of How To Write A Speech
Basics Of How To Write A SpeechAnna Landers
 
Website That Write Essays For You To Be - There Is A Web
Website That Write Essays For You To Be - There Is A WebWebsite That Write Essays For You To Be - There Is A Web
Website That Write Essays For You To Be - There Is A WebAnna Landers
 

More from Anna Landers (20)

Dinosaur Stationery Free Printable - Printable Templ
Dinosaur Stationery Free Printable - Printable TemplDinosaur Stationery Free Printable - Printable Templ
Dinosaur Stationery Free Printable - Printable Templ
 
How To Write Journal Paper In Latex - Amos Writing
How To Write Journal Paper In Latex - Amos WritingHow To Write Journal Paper In Latex - Amos Writing
How To Write Journal Paper In Latex - Amos Writing
 
Paying Someone To Write Papers
Paying Someone To Write PapersPaying Someone To Write Papers
Paying Someone To Write Papers
 
010 Essay Example Personal About Yourself Examples
010 Essay Example Personal About Yourself Examples010 Essay Example Personal About Yourself Examples
010 Essay Example Personal About Yourself Examples
 
The Importance Of College Education - Peachy Essay
The Importance Of College Education - Peachy EssayThe Importance Of College Education - Peachy Essay
The Importance Of College Education - Peachy Essay
 
Sample Scholarship Essay
Sample Scholarship EssaySample Scholarship Essay
Sample Scholarship Essay
 
Introduction - How To Write An Essay - LibGuides At Univers
Introduction - How To Write An Essay - LibGuides At UniversIntroduction - How To Write An Essay - LibGuides At Univers
Introduction - How To Write An Essay - LibGuides At Univers
 
Professionalism Medicine Essay
Professionalism Medicine EssayProfessionalism Medicine Essay
Professionalism Medicine Essay
 
Scarecrow PRINTABLE Stationery Paper Etsy
Scarecrow PRINTABLE Stationery Paper EtsyScarecrow PRINTABLE Stationery Paper Etsy
Scarecrow PRINTABLE Stationery Paper Etsy
 
Vintage Handwriting Digital Paper Textures
Vintage Handwriting Digital Paper TexturesVintage Handwriting Digital Paper Textures
Vintage Handwriting Digital Paper Textures
 
Legit Essay Writing Services Scholarship - HelpToStudy.Com
Legit Essay Writing Services Scholarship - HelpToStudy.ComLegit Essay Writing Services Scholarship - HelpToStudy.Com
Legit Essay Writing Services Scholarship - HelpToStudy.Com
 
Writing Papers For Money - College
Writing Papers For Money - CollegeWriting Papers For Money - College
Writing Papers For Money - College
 
Paperback Writer The Beatles Bible
Paperback Writer The Beatles BiblePaperback Writer The Beatles Bible
Paperback Writer The Beatles Bible
 
How Does Custom Essay Help Servic
How Does Custom Essay Help ServicHow Does Custom Essay Help Servic
How Does Custom Essay Help Servic
 
003 Compare And Contrast Essay Examples College E
003 Compare And Contrast Essay Examples College E003 Compare And Contrast Essay Examples College E
003 Compare And Contrast Essay Examples College E
 
Dos And Don Ts Of Essay Writing. 15 Essay Writing D
Dos And Don Ts Of Essay Writing. 15 Essay Writing DDos And Don Ts Of Essay Writing. 15 Essay Writing D
Dos And Don Ts Of Essay Writing. 15 Essay Writing D
 
Korean Stationery Gift Envelope Finely Flower An
Korean Stationery Gift Envelope Finely Flower AnKorean Stationery Gift Envelope Finely Flower An
Korean Stationery Gift Envelope Finely Flower An
 
How To Format A Paper In Mla. MLA Format For Essa
How To Format A Paper In Mla. MLA Format For EssaHow To Format A Paper In Mla. MLA Format For Essa
How To Format A Paper In Mla. MLA Format For Essa
 
Basics Of How To Write A Speech
Basics Of How To Write A SpeechBasics Of How To Write A Speech
Basics Of How To Write A Speech
 
Website That Write Essays For You To Be - There Is A Web
Website That Write Essays For You To Be - There Is A WebWebsite That Write Essays For You To Be - There Is A Web
Website That Write Essays For You To Be - There Is A Web
 

Recently uploaded

Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.arsicmarija21
 
Capitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitolTechU
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...jaredbarbolino94
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfSumit Tiwari
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentInMediaRes1
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfMahmoud M. Sallam
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Recently uploaded (20)

Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.
 
Capitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptx
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media Component
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdf
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 

A Sales Forecasting Model Based on Internal Organizational Variables.pdf

Electronic copy available at: http://ssrn.com/abstract=2214543

A Sales Forecasting Model Based on Internal Organizational Variables

Jose M. Pinheiro

Abstract

In this paper we develop a sales forecasting model for a small-sized business unit focused on exports. Through a choice of explanatory variables internal to the organization we develop an econometric sales forecasting method, and compare its outputs with simpler univariate forecasting techniques in use at the organization. We show that the econometric technique produces a better fit to observable data, allowing for sensible statistical inference, and adds explanatory features not accessible to the simpler extrapolation techniques, by integrating quantitative variables accounting for relevant management decisions.

Key words: Sales Forecasting, Econometric Sales Forecasting, Time Series Models, OLS Regression, Extrapolation Sales Forecasting.

1. Introduction

In this paper, we present, analyze and discuss an econometric sales forecasting model, discuss its capabilities, and compare it to extrapolation procedures in use at an exports business unit of a Portuguese firm of fast moving consumer goods (FMCG).

The peculiarity of the econometric model presented is its exclusive focus on observable explanatory variables which are internal to the organization. This feature could facilitate its testing and eventual implementation in different organizations sharing a few characteristic features with the organization upon which we base the model, since no great effort or cost would be necessary to gather the kind of data it requires.

The availability of the internal data necessary to build the explanatory variables, together with the work of Rumelt (1991), which concluded that external market and industry characteristics were less determinant of a firm's performance than the organization itself, is what led us to the econometric models presented in this paper.
Similarly, we show that our econometric model can reasonably explain sales (exports) that would not be as easily explained had we opted for external economic variables such as the annual growth rates of Portuguese exports during the period chosen, or even Europe’s annual GDP growth rates (European countries have been the destination of more than 70% of the exports produced by the business). In fact, if we look at the exports data of this business unit (2007-2012), we see a continuous growth with no parallel in either the annual growth rates of Portuguese exports or the annual GDP growth rates of most destination countries, both negatively affected in the aftermath of the U.S. sub-prime crisis. Since its set-up year, 2006, the business unit has presented annual sales growth rates of 90% (2007), 113% (2008), 15% (2009), 17% (2010), 34% (2011) and 33% (2012 est.). These growth rates clearly contrast with the negative growth rates of Portuguese exports of goods and the steep reduction in imported goods in most economies in 2008 and 2009 (Banco de Portugal, 2011).

We believe the model class we discuss could easily be deployed in other SMEs to their management's advantage, allowing for the reinforcement of simpler forecasting techniques, better setting of sales objectives, informed direct decision-making regarding variables such as pricing, client portfolio diversification and sales force size, for example, as well as indirect decision-making depending on sales, such as inventory, production planning, and other decisions.

This paper is organized as follows. In Section 2, we discuss the forecasting literature as it relates to econometric models and their comparison with univariate extrapolation methods and judgmental forecasting techniques. In Section 3 we present the model, as well as the data sources used and their characteristics. We present the model estimations and some diagnostic test results, and compare the model with the simpler techniques in use at the business unit under focus. In Section 4, we present and discuss a battery of tests performed on the model and make some inferences about the accuracy of its predictions, and in Section 5 we conclude with the opportunities and limitations inherent to this paper’s work. Section 6 is the Technical Annex of the paper.

2. Literature

Although Econometrics developed significantly throughout the 20th century, adding “empirical content to economic theory and allowing theories to be tested and used for forecasting and policy evaluation” (Geweke, et al., 2008), relatively few Portuguese firms discuss, or use, econometric forecasting methods, especially if their organization is small. This may be due to a lack of specialized competences, a lack of resources, limitations of organizational culture, modeling difficulties, or simply a lack of knowledge about the power of modern econometrics in forecasting.

Despite this perceived situation, many marketing managers throughout the world value sales forecasting techniques. In fact, sales are the main variable upon which budgets in all organizational areas are developed, determining the sustainable components of annual marketing investments, staff recruitment, and other types of investment needed for organizational development, training or maintenance. In a survey of marketing managers conducted as far back as 1975, 93% claimed sales forecasting to be “one of the most critical” or “a very important aspect of their company’s success” (Dalrymple, 1975). The same author concluded that “formal marketing plans are often supported by forecasts” (Dalrymple, 1987). However, astonishingly, most of the marketing literature does not address sales forecasting (Armstrong, et al., 1987). Even to this day, it is not easy to access public domain research on sales forecasting, one of the reasons being its private and classified nature.

Sales forecasting methodologies can be based on judgmental sources or statistical sources, and sometimes use both (Armstrong & Collopy, 1998). While judgmental sources are qualitative in nature, statistical sources can be either univariate or multivariate. Univariate statistical sources used in forecasting models rely mainly on extrapolation methods based on quantitative analogies and on rule-based forecasting. Extrapolation uses historical data, and often exponential smoothing, which gives more weight to recent data, while rule-based forecasting accounts for judgmental knowledge of the factors influencing sales forecasts. Multivariate statistical sources, on the other hand, are based on data and rely on econometric models to produce forecasts.

Some researchers have concluded that relatively simple extrapolation methods can perform as well as more complex methods (Makridakis, 1984; Armstrong, 1985). This is not surprising if we assume that simpler extrapolation methods may at times heuristically integrate management information about the object of the forecast, which is not always the case with the data used in econometric models.
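To make the extrapolation idea above concrete, the following is a minimal Python sketch of simple exponential smoothing, in which each new level is a weighted average of the latest observation and the previous level, so that recent data weigh more. It is an illustration only and is not claimed to be the procedure in use at the business unit; the function name, the smoothing parameter `alpha` and the sample figures are hypothetical.

```python
def exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    The last element can be read as the one-step-ahead forecast.
    `alpha` in (0, 1] is the weight given to the most recent observation
    (a hypothetical tuning parameter, not a value taken from the paper).
    """
    if not series:
        raise ValueError("series must contain at least one observation")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")

    level = series[0]          # initialise the level at the first observation
    levels = [level]
    for observation in series[1:]:
        # New level = alpha * latest data + (1 - alpha) * previous level
        level = alpha * observation + (1.0 - alpha) * level
        levels.append(level)
    return levels


if __name__ == "__main__":
    # Hypothetical monthly sales figures, for illustration only.
    monthly_sales = [100.0, 120.0, 110.0, 150.0, 160.0]
    smoothed = exponential_smoothing(monthly_sales, alpha=0.3)
    print("Forecast for next month:", smoothed[-1])
```

A larger `alpha` makes the forecast react faster to recent months, at the cost of more sensitivity to noise.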
In fact, it is not simple for time series models, for example, to integrate a manager’s knowledge of context variables that are hard or impossible to quantify but may nevertheless strongly influence the series in the future. Some quantitative methods have the limitation of intrinsically assuming that the causal forces that influenced a historical data series will propagate in the same manner into the future. Therefore, judgmental extrapolations may be more effective than quantitative extrapolations whenever there are known and anticipated sales changes, when there is relevant knowledge that can easily be integrated into judgmental extrapolations, or when there are reasonably clear expectations about the way in which several context factors may affect sales development (for example, political decisions affecting aggregate demand, top management decisions expected to be taken within the forecasting horizon and affecting sales performance, industrial and technological changes, and other types of change that managers know to affect sales).

Some econometric models can incorporate, to some extent, decision making and judgmental planning, for example by explicitly taking into account marketing variables planned over the forecasting horizon. These models tend to be highly complex and to require considerable effort in data gathering, and are not within reach of many firms.

Judgmental and univariate extrapolations for sales forecasting are a manager’s obvious choice when substantial amounts of sales data are not available. Judgmental extrapolations rely primarily on good knowledge of context, including product demand, markets, future plans and other factors. Sales extrapolations can also work well if sales behave regularly. Mentzer & Kahn (1995) argue that extrapolation of the historical sales trend is common in firms. Other authors presented evidence that simple extrapolation-like forecasts were often amongst the most accurate procedures (Schnaars, 1984). Even when there is more uncertainty, conservative criteria may be adopted, such as staying close to the historical average, which helps dampen the trend as the horizon increases (Gardner & McKenzie, 1985).

However, according to (Baker, 1999), econometric methods are more useful when:
i. Strong causal relationships with sales are expected;
ii. These causal relationships can be estimated;
iii. Large changes are expected to occur in the causal variables over the forecast horizon; and
iv. The changes in the causal variables can be forecasted or controlled, especially with respect to their direction.
They further argue that in the absence of any of the above conditions, econometric models “should not be expected to improve accuracy.”

Comparing different forecasting models and techniques is important, since one often gains from considering more than one forecasting method (Baker, 1999). Furthermore, (Baker, 1999) states that the selection of forecasting methods can be avoided by combining forecasts, and that fit measures such as R² are of secondary importance vis-a-vis realistic simulations of the actual situation of a forecaster; for example, ex ante forecasts can be more important than fit measures. Other researchers have suggested that when well-structured domain knowledge is lacking, equally weighted averages are as accurate a scheme as any other (Clemen, 1989). Another example was provided by Blattberg & Hoch (1990), who obtained superior sales forecasts by taking equally weighted averages of judgmental forecasts and quantitative models. There is therefore a trend in the literature pointing towards the comparison and combined use of forecasting techniques in order to achieve more balanced and accurate forecasts.

3. Econometric Model

3.1 Data Set

The data used consist of a set of sixty-seven observations for four variables in an exports business unit of a Portuguese firm. The data are monthly observations, ranging from February 2007 to August 2012. Currently the business unit handles clients in more than 60 different world markets, directing 67% of its exports to European countries, 13% to Asian countries, 12% to African countries, 7% to American countries and 1% to Oceania.

The business unit sells branded disposable paper products, a product category commonly considered to combine a high elasticity of demand (due to the availability of substitute goods in most markets and the typically narrow range of retail prices) with a low geographical range for exports (due to the high impact of transport costs on its pricing structure).

The data gathered relate directly to both the business unit's modus operandi and focus, and the rationale of the model.
Management has trained the sales team to operate in a focused manner, transversal as well as deep and ambitious:
• To penetrate as many markets as possible so as to maximize business opportunities and minimize sales vulnerability over time;
• To prioritize “optimal” commercial operations combining better chances of providing sustainable sales and business development opportunities; this policy is aligned with a variable monthly compensation scheme requiring sales effectiveness;
• To use pricing as deemed necessary to achieve penetration, without incurring market risk or losses;
• To foster teamwork as much as role specialization, to the maximum possible extent, in order to maximize productivity and know-how sharing among team members;
• A special focus on establishing and fostering indirect business structures in foreign countries has been key to the continuous development of a global and expanding network of agents, distributors, brokers and other types of business relations.

The data gathered has 67 monthly observations for each of the following variables:
• Sales data of the business unit: “Sales”, the dependent variable (in Euro).
• Average price of sales per unit mass: “Price”, an independent variable (in Euro/kg).
• Number of people on the sales team: “Sales Force”, an independent variable.
• Monthly number of sales operations concluded: “Portfolio”, an independent variable.

It is possible to look at all the data gathered in a combined way using a multiple display graphic with left and right scales:

[Figure: monthly series, 2007 to 2012, of Sales (right scale) and of Portfolio, Price and Salesforce (left scale).]

It is also possible to inspect the degree to which the variables chosen are correlated. The correlation matrix for the variables chosen is:

Correlation coefficients, using the observations 2007:02 - 2012:08
5% critical value (two-tailed) = 0.2404 for n = 67

             Sales     Portfolio   Price     Salesforce
Sales        1.0000    0.6276     -0.4621    0.6315
Portfolio              1.0000     -0.3714    0.5390
Price                              1.0000   -0.3315
Salesforce                                   1.0000

Correlation between variables only indicates the magnitude and sign of their relation, and does not imply causality. The Pearson product-moment correlation coefficient between two variables $X$ and $Y$ can be written as

$$\rho_{X,Y} = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y},$$

where $\operatorname{cov}(X,Y)$ is the covariance between $X$ and $Y$, and $\sigma_X$ and $\sigma_Y$ are their standard deviations. For each sample of two discrete variables observed at discrete moments $t$ ranging from 1 to $T$, we have

$$r_{xy} = \frac{\sum_{t=1}^{T} (x_t - \bar{x})(y_t - \bar{y})}{\sqrt{\sum_{t=1}^{T} (x_t - \bar{x})^2}\,\sqrt{\sum_{t=1}^{T} (y_t - \bar{y})^2}}.$$

3.2 Economic and Managerial Rationale of the Explanatory Variables

3.2.1 Pricing

Our choice of data and variables has a simple rationale, in some cases deeply supported by existing literature. Alfred Marshall was probably the first to define the elasticity of demand, providing a mathematical framework for understanding how demand may vary with prices (Marshall, 1890). In his own words: “And we may say generally: - the elasticity (or responsiveness) of demand in a market is great or small according as the amount demanded increases much or little for a given fall in price, and diminishes much or little for a given rise in price”. And, more specifically: "the only universal law as to a person's desire for a commodity is that it diminishes... but this diminution may be slow or rapid. If it is slow... a small fall in price will cause a comparatively large increase in his purchases. But if it is rapid, a small fall in price will cause only a very small increase in his purchases. In the former case... the elasticity of his wants, we may say, is great. In the latter case... the elasticity of his demand is small." Marshall defined this elasticity using differential calculus.

Generally, elasticity can vary depending on the goods and related markets concerned. The availability of substitute goods tends to increase elasticity (as is the case for disposable paper products), while the specificity of some goods tends to lower it (food and energy, but also branded items in the case of fast moving consumer goods). The higher the percentage of disposable income needed to buy a good, the higher the elasticity tends to be, given that the purchase decision will tend to be much less discretionary and much more cautious.
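As a brief illustration of the concept discussed above, the point price elasticity of demand is usually written as $\varepsilon = \frac{dQ}{dP}\cdot\frac{P}{Q}$, where $Q$ is the quantity demanded and $P$ the price. With only two observed price and quantity pairs, a common discrete approximation is the arc (midpoint) elasticity. The Python sketch below computes it; the function name and the figures are hypothetical and are not taken from the business unit's data.

```python
def arc_elasticity(price_old, price_new, quantity_old, quantity_new):
    """Arc (midpoint) price elasticity of demand between two observations.

    Percentage changes are measured relative to the midpoint of the old and
    new values, so the result does not depend on which point is the "base".
    """
    if price_old == price_new:
        raise ValueError("prices must differ to compute an elasticity")

    mean_price = (price_old + price_new) / 2.0
    mean_quantity = (quantity_old + quantity_new) / 2.0

    pct_change_quantity = (quantity_new - quantity_old) / mean_quantity
    pct_change_price = (price_new - price_old) / mean_price
    return pct_change_quantity / pct_change_price


if __name__ == "__main__":
    # Hypothetical example: price rises from 1.00 to 1.10 Euro/kg and
    # monthly volume falls from 100 to 90 tonnes.
    elasticity = arc_elasticity(1.00, 1.10, 100.0, 90.0)
    print(f"Arc elasticity: {elasticity:.3f}")
```

A value below -1 would indicate elastic demand (volume falls proportionally more than price rises), while a value between -1 and 0 would indicate inelastic demand.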
The products handled by the business unit have a twofold characteristic: on the one hand, they are usually considered easily replaceable, since in most markets, even emerging ones, substitute products are readily available; on the other hand, a brand factor has been introduced and is changing the nature of the goods, a process leading to easier exports and sales to retailers seeking to differentiate their product assortments in their home markets, even when located at large distances from the producer’s facilities.
The econometric model developed will allow us to assess the average elasticity of demand of the underlying products in the set of export markets implicitly included in the sample data, for the period considered in the analysis.

3.2.2 Sales Force Sizing

According to (Lilien, et al., 1992), “The size of the sales force is one of the most important decisions facing executives in many industries” and “The specific options chosen – sales force size versus the use of wholesalers, distributors, agents, and so forth – depend on the relative costs and the selling tasks required”.

The sales force sizing decision often uses heuristics such as the breakdown (percentage of sales) approach, the workload approach or an industry guideline method (Lilien, et al., 1992). Such heuristics are used in the business unit under study; more precisely, a workload approach is regularly used when assessing the need to resize the business unit sales force.

The sales force size can also be viewed as a proxy for a group of variables including external aggregate demand and sales force experience. Workload is therefore not only a function of the processes involved in generating and processing the sales per se, but also a function of the experience, training, team work and personal sales effectiveness of each salesman in the team.

Given the small size of the business unit’s sales force, workload assessments result from direct management follow-up of the variations and perceived complexity of the processes under development, at a given time as well as historically. Such an assessment is performed periodically. Our sample contains several variations in the sales force size, which reflect either salesmen turnover or, more commonly, assessments and the consequent resizing of the sales force.
In the business unit under study, the sales force size is judged to be of critical importance, given the high workloads and longer lead times typically associated with the development of export operations, which are usually more complex than sales directed at domestic markets.

3.2.3 Portfolio Sizing

Although perhaps less portrayed in the marketing science literature, client portfolio size is an important variable for maximizing sales growth and business opportunities while simultaneously reducing the vulnerability of a business unit's sales, and the volatility of its cash flows, over time.

(Crina, et al., 2011) have conducted interesting research, using financial portfolio theory to reorganize client portfolios towards profit maximization and cash flows
volatility minimization, providing guidelines for incorporating a risk overlay into established customer management frameworks.

The example provided by the aforementioned authors establishes a parallel between the role of diversification in common business portfolios and the well-known role of diversification in stock market portfolios, with the same objectives of minimizing risk and maximizing returns.

With a similar concept in mind, the business unit management has provided moral incentives for its sales force to secure ever more medium and large clients and business prospects over time, almost abolishing geographical discrimination criteria. These objectives have led to a continuous business seeking attitude, across ever more international markets, as well as to expanding business within each national market. As the group attitude set in, this process resulted in a slow but steady growth of the client portfolio, and in a growing number of medium and large clients within the portfolio, over time.

This strategy should at least partly explain why the business unit sales records do not show any particularly strong downturn in the aftermath of the sub-prime crisis, unlike many other export business units.

3.3 Ex Ante Modeling Hypothesis

In view of the literature support and managerial rationale presented in the preceding sections, we hypothesize the following relations:

I. “Price” should have a negative relation with “Sales”;
II. “Sales Force” [Size] should have a positive relation with “Sales”, and
III. “Portfolio” [Size] should have a positive relation with “Sales”.

As such, regardless of the econometric model, the sign of the estimated coefficient for Price should be negative, and the signs of the estimated Sales Force and Portfolio coefficients should be positive.
3.4 Econometric Model

3.4.1 Model Specification and Estimation

According to (Gujarati & Porter, 2009), “broadly speaking, there are five approaches to economic forecasting based on time series data: (1) exponential smoothing methods, (2) single-equation regression models, (3) simultaneous-equation regression models, (4) autoregressive integrated moving average models, and (5) vector autoregression models.”

We have a multiple regression model, specified by:
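\[ Sales_t = \alpha \cdot Portfolio_t^{\beta_2} \cdot Price_t^{\beta_3} \cdot Salesforce_t^{\beta_4} \cdot e^{u_t} \]

with t = 1…T, T = 67, and $u_t$ the stochastic disturbance term associated with observation t. A linearization can be written as:

\[ \ln(Sales_t) = \beta_1 + \beta_2 \ln(Portfolio_t) + \beta_3 \ln(Price_t) + \beta_4 \ln(Salesforce_t) + u_t \]

where $\beta_1 = \ln(\alpha)$.

It is worth mentioning that the linearized data presents a much smoother evolution in time, as depicted in the graphic below:

[Figure: l_Sales, l_Portfolio, l_Price and l_Salesforce, 2007 to 2012]

The implicit assumptions for the use of the Ordinary Least Squares method in the multiple regression model are (Gujarati & Porter, 2009):

i) Its linearity in the parameters.
ii) The regressors are fixed or independent of the error term: $\operatorname{cov}(u_t, X_t) = 0$.
iii) The disturbance has zero conditional mean: $E(u_t \mid X) = 0$.
iv) Homoscedasticity: $\operatorname{var}(u_t \mid X) = \sigma^2$.
v) No autocorrelation: $\operatorname{cov}(u_t, u_s \mid X) = 0$ for $t \neq s$.
vi) The number of observations T must be greater than the number of parameters.
vii) There must be variation in the values of all independent variables.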
viii) There must not be any exact correlation between the independent variables.
ix) The model is correctly specified.

Compliance with the above assumptions is observed and further discussed in Sections 4 and 6.

The estimation of the model (Table 1) using Ordinary Least Squares shows that all coefficients are significant at the 1% significance level, and that the coefficient signs are as expected. The estimated model equation can be written as:

\[ \widehat{\ln(Sales_t)} = 9.728 + 0.682 \ln(Portfolio_t) - 0.689 \ln(Price_t) + 0.600 \ln(Salesforce_t) \]

with t = 1…T, T = 67.

Table 1
Model 1: OLS, using observations 2007:02-2012:08 (T = 67)
Dependent variable: l_Sales

                Coefficient   Std. Error   t-ratio   p-value     Sig.
Constant         9.72829      0.646982     15.04     1.62e-022   ***
l_Portfolio      0.681882     0.197953      3.445    0.0010      ***
l_Price         -0.688531     0.192897     -3.569    0.0007      ***
l_Salesforce     0.60029      0.224509      2.674    0.0095      ***

Mean dependent var   11.92724     S.D. dependent var   0.669145
Sum squared resid    12.43029     S.E. of regression   0.444192
R-squared            0.579374     Adjusted R-squared   0.559344
F(3, 63)             28.92555     P-value(F)           7.01e-12
Log-likelihood      -38.63623     Akaike criterion     85.27247
Schwarz criterion    94.09124     Hannan-Quinn         88.76208
Rho                  0.068852     Durbin-Watson        1.817990

The Ordinary Least Squares (OLS) method minimizes the sum of squared vertical distances between the observed responses in the data set and the responses predicted by the linear approximation. This process estimates coefficients (“slopes”) relating the explanatory variables to the dependent variable, but provides no evidence of causality.

The OLS estimator is consistent when the regressors are exogenous (i.e. determined outside the model and uncorrelated with the error term) and there is no perfect multicollinearity (exact linear relations among two or more regressors), and it is optimal in the class of linear unbiased estimators when the errors are homoscedastic (i.e. display constant variance) and serially uncorrelated (i.e. the error terms are not correlated across time).
Under these conditions, the method of OLS provides minimum-variance, mean-unbiased estimation when the errors have finite variances. Under the additional assumption that the errors are normally distributed (a hypothesis that is not rejected at the 5% level, as shown in Section 6), the OLS estimator is also the maximum likelihood estimator.
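As a minimal sketch of the least-squares step described above, the following Python code fits a log-log regression by solving the normal equations on simulated data. The data, the helper names (`simulate`, `ols`) and the "true" elasticities in `TRUE_BETA` are assumptions made for illustration only; they are not the business unit's data, and the estimates reported in this paper were obtained with Gretl, not with this code.

```python
import math
import random

# Hypothetical "true" parameters: constant, portfolio, price, sales force.
TRUE_BETA = [9.7, 0.7, -0.7, 0.6]


def simulate(T=67, noise_sd=0.05, seed=0):
    """Simulate T observations of a log-log sales model (illustrative data)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(T):
        row = [1.0,
               math.log(rng.uniform(15, 40)),    # ln(Portfolio)
               math.log(rng.uniform(1.0, 3.0)),  # ln(Price)
               math.log(rng.randint(2, 6))]      # ln(Salesforce)
        X.append(row)
        y.append(sum(b * x for b, x in zip(TRUE_BETA, row))
                 + rng.gauss(0.0, noise_sd))
    return X, y


def ols(X, y):
    """Return OLS coefficients by solving the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    a = [xtx[i] + [xty[i]] for i in range(k)]  # augmented matrix
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, k):
            f = a[r][col] / a[col][col]
            for c in range(col, k + 1):
                a[r][c] -= f * a[col][c]
    # Back substitution.
    b = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * b[j] for j in range(i + 1, k))
        b[i] = (a[i][k] - tail) / a[i][i]
    return b


beta_hat = ols(*simulate())
for name, b in zip(["const", "l_Portfolio", "l_Price", "l_Salesforce"], beta_hat):
    print(f"{name:13s} {b: .4f}")
```

Solving the normal equations directly is adequate for a handful of regressors; for larger or ill-conditioned problems a QR-based solver from a numerical library would be the more robust choice.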
From Table 1 we can see that the $R^2$ of the regression is 0.579, meaning the model can explain almost 58% of the variance observed in the sample. This is not an excellent result, but considering we are using only organizational variables that are easy to gather in any firm, it is a reasonable one.

The $R^2$ can be written as $R^2 = 1 - SS_{res}/SS_{tot}$, where $SS_{res}$ is the sum of squared residuals with respect to the regression, and $SS_{tot}$ is the sum of squared deviations with respect to the average. The adjusted $R^2$, or $\bar{R}^2$, is related to the $R^2$ through the relation $\bar{R}^2 = 1 - (1 - R^2)\frac{T-1}{T-n-1}$, where T is the sample size and n the number of regressors.

Looking at Table 1 above, one can find in the t-ratios a measure of individual coefficient significance. Whenever the absolute t-ratio for a given coefficient is higher than the critical t value for the same degrees of freedom (here for a right tail probability of 2.5%), the null hypothesis (stating that the said coefficient is zero) is rejected. The t-stat for a coefficient can be expressed through $t = \hat{\beta}_j / \sqrt{\widehat{\operatorname{var}}(\hat{\beta}_j)}$.

Thus for the coefficient $\beta_2$, for example, the 95% confidence interval is $\hat{\beta}_2 \pm t_{0.025,\,63} \cdot se(\hat{\beta}_2)$, i.e. between 0.287 and 1.077. In the same way, we can present the 95% confidence intervals for all the coefficients in this estimation. Please note that none of the intervals contains the value zero, which would render a coefficient non-significant.

Variable          Coefficient   95% Conf. Int. (Min.)   95% Conf. Int. (Max.)
Constant           9.72829       8.43540                11.0212
Ln(Portfolio)      0.681882      0.286304                1.07746
Ln(Price)         -0.688531     -1.07401                -0.303057
Ln(Salesforce)     0.600290      0.151645                1.04893

The p-value for a given coefficient (see Table 1) indicates the probability of obtaining a t-ratio at least as extreme as the one observed if the true coefficient were zero (i.e. if the variable had no explanatory power).

The F-statistic in the table is also of importance. It provides a measure of the global significance of the regression. We have that $F(3, 63) = 28.93$.
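The summary statistics discussed above can be cross-checked from the figures reported in Table 1. The sketch below is only an arithmetic check under stated assumptions: the function names are hypothetical, and the critical value `T_CRIT` (about 1.998 for 63 degrees of freedom) is taken as the value implied by the confidence interval table rather than computed from a t distribution.

```python
# Values reported in Table 1.
R2 = 0.579374      # R-squared
T = 67             # number of observations
N_REGRESSORS = 3   # Portfolio, Price, Salesforce (constant excluded)

# Assumption: two-tailed 5% critical t value for 63 degrees of freedom,
# as implied by the reported confidence intervals.
T_CRIT = 1.9983


def adjusted_r2(r2, t, n):
    """Adjusted R-squared from R-squared, sample size t and n regressors."""
    return 1 - (1 - r2) * (t - 1) / (t - n - 1)


def f_statistic(r2, t, n):
    """Global F-statistic: explained over unexplained variance per d.o.f."""
    return (r2 / n) / ((1 - r2) / (t - n - 1))


def confidence_interval(coef, std_err, t_crit=T_CRIT):
    """Symmetric confidence interval coef +/- t_crit * std_err."""
    half_width = t_crit * std_err
    return coef - half_width, coef + half_width


adj = adjusted_r2(R2, T, N_REGRESSORS)
f = f_statistic(R2, T, N_REGRESSORS)
lo, hi = confidence_interval(0.681882, 0.197953)  # l_Portfolio row of Table 1

print(f"adjusted R-squared: {adj:.6f}")
print(f"F(3, 63):           {f:.4f}")
print(f"95% CI, l_Portfolio: ({lo:.4f}, {hi:.4f})")
```

The three results should agree, up to rounding, with the adjusted R-squared, the F(3, 63) statistic and the Ln(Portfolio) interval reported in the tables above.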
The F-statistic is basically the ratio between the explained variance and the unexplained variance of the regression (it can also be expressed by $F = \frac{R^2/n}{(1-R^2)/(T-n-1)}$). A high ratio thus means a significant and good fit to the data. The corresponding p-value we get from Table 1 is also much smaller than 0.05. The null hypothesis, which states that all slope coefficients are zero (i.e. that the dependent variable can only be interpreted as a stochastic variation not explained by the variation of the independent variables), is thus rejected, and we may conclude the coefficients are
significant and different from zero. The low p-value associated with the F-stat of the regression indicates the probability of finding a regression with an F-stat at least this large by chance, assuming non-significant coefficients.

The coefficients of the log-log model can be interpreted as elasticities (under ceteris paribus circumstances, i.e. varying one variable at a time while keeping the others constant). As such, and looking at the estimated model equation, we know that for every 1% increase in Price we are very likely to witness a 0.689% decrease in Sales.

The price elasticity can be generally written as $E_p = \frac{P}{Q} \cdot \frac{dQ}{dP}$, where the second fraction is the derivative of demand with respect to price, i.e. the slope of the demand curve. When the elasticity is zero, the demand is said to be perfectly inelastic: the demand is the same, no matter the price. When the elasticity presents values between -1 and 0 (as in our regression), the demand is said to be relatively inelastic (not very sensitive to price, especially if the value is closer to zero). When the elasticity value is -1, the demand is said to be unit elastic, i.e. demand varies by the same proportion as price, in the opposite direction. Finally, when the elasticity is lower than -1, the demand is said to be relatively elastic, or elastic, and we enter the zone of the curve where demand is highly sensitive to price variations.

The interpretation of the other elasticities is less straightforward but still makes managerial sense: a 1% increase in client Portfolio is estimated to result in a 0.682% increase in Sales, and a 1% increase in Sales Force is estimated to result in a 0.600% increase in Sales. Therefore, if the management wants the Portfolio to grow by 25% in the short term (say, from around 32 to around 40), that would likely result in an increase of about 17% in Sales (0.25 x 0.682).
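The scenario arithmetic above uses the linear approximation "percentage change in the regressor times the elasticity". A minimal sketch of that calculation follows, together with the exact change implied by a log-log model, which differs slightly for large changes. The function names are hypothetical; the elasticities are the Table 1 estimates, and the 25% and 1% changes are the illustrative values used in the text.

```python
def approx_sales_change(elasticity, pct_change):
    """Linear approximation: fractional change in Sales = elasticity * change."""
    return elasticity * pct_change


def exact_sales_change(elasticity, pct_change):
    """Exact fractional change implied by Sales proportional to X**elasticity."""
    return (1.0 + pct_change) ** elasticity - 1.0


# Elasticities estimated in Table 1.
E_PORTFOLIO = 0.682
E_PRICE = -0.689

# A 25% increase in Portfolio, ceteris paribus.
portfolio_approx = approx_sales_change(E_PORTFOLIO, 0.25)
portfolio_exact = exact_sales_change(E_PORTFOLIO, 0.25)

# A 1% increase in Price, ceteris paribus.
price_approx = approx_sales_change(E_PRICE, 0.01)
price_exact = exact_sales_change(E_PRICE, 0.01)

print(f"Portfolio +25%: approx {portfolio_approx:.1%}, exact {portfolio_exact:.1%}")
print(f"Price +1%:      approx {price_approx:.3%}, exact {price_exact:.3%}")
```

For small changes the two figures are nearly identical; for a change as large as 25% the exact figure is somewhat below the linear approximation, because the elasticity is smaller than one.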
In the same way, if the team is increased from 4 to 5 people (again a 25% increase), Sales would be expected to rise in the short term by about 15% (0.25 x 0.60).

A battery of tests has been performed on the regression (discussed in more detail in Section 4, with detailed estimations presented in Section 6), showing that:

• The residuals follow a normal distribution;
• There is no sign of perfect collinearity among the independent variables;
• The RESET test indicates an adequate specification;
• White’s test shows no traces of heteroscedasticity;
• The Durbin-Watson test indicates no autocorrelation in the error terms;
• The Breusch-Godfrey test indicates no autocorrelation in the error terms, and
• The Chow test with a structural break at observation 34 indicates no structural break.

The results seem to indicate that this model and the resulting regression can be used for statistical inference and statistical predictions. Below, the estimated versus observed values for the dependent variable, within a 95% confidence interval.
[Figure: l_Sales, its forecast and the 95 percent interval, 2007 to 2012]

3.4.2 Extrapolations

The business unit under study uses simple univariate regressions embedded in Excel. The methods used are linear, exponential and polynomial regressions of monthly Sales. For comparison purposes, the three regression methods were applied to the same series of Sales (67 observations, ranging from February 2007 to August 2012):

[Fitted equations of the linear, exponential and polynomial regressions]

All these models are easier to compute than the previous econometric model, but they do not integrate any explanatory variables. They therefore do not allow for inference, or for forecasting under scenarios that vary any of the key explanatory factors considered before, which we have shown to be statistically significant in explaining the dependent variable (Sales).

Although these simpler regressions are practical and straightforward to use for managerial purposes, we believe, and our previous model discussion shows, that they could be
easily replaced or complemented by the model portrayed in the previous sub-section, with important advantages for management purposes.

4. Tests and Inference

In this section we test for possible violations of the basic assumptions allowing the use of Ordinary Least Squares regression and the interpretation of the data as a time series.

4.1 Adequate Specification test

We use the Ramsey “RESET” test, a general specification test for the linear regression model. More specifically, it tests whether non-linear combinations of the fitted values help explain the response variable. The intuition behind the test is that if non-linear combinations of the explanatory variables have any power in explaining the dependent variable, the model is not adequately specified. The null hypothesis is that of an adequate specification.

We have run an auxiliary regression including the squares and the cubes of the fitted values (see Section 6). The statistic used can be expressed by

\[ F = \frac{(R^2_{new} - R^2_{old})/k}{(1 - R^2_{new})/(N - g)} \sim F(k, N - g), \]

where k is the number of new regressors, g the total number of coefficients in the augmented regression and N the total number of observations.

We have obtained a test statistic F(2, 61) = 0.137224, with p-value = P(F(2, 61) > 0.137224) = 0.872044. At a significance level of 5%, the critical value of F(2, 61) is larger than the test statistic, the p-value is above 0.05, and therefore we cannot reject the null hypothesis of correct specification.

4.2 Multicollinearity

A multiple regression model with correlated explanatory variables can indicate how well the entire bundle of independent variables predicts the outcome variable, but it may not give valid results about any individual variable, or about which variables are redundant with respect to others (i.e. linearly dependent on other variables).

We use two formal detection measures for multicollinearity: the tolerance and the variance inflation factor (VIF).
Tolerance can be defined as $1 - R_j^2$, where $R_j^2$ is the coefficient of determination of a regression of variable j on all the other independent variables (not including the dependent variable). The variance inflation factor (VIF) is defined as $VIF_j = 1/(1 - R_j^2)$, the reciprocal of the tolerance. A tolerance of less than 0.20 or 0.10, or equivalently a VIF of 5 or 10 or above, indicates multicollinearity. The VIF factor reflects all other factors that influence the uncertainty in the coefficient estimates. A
high VIF value, or a low tolerance value, indicates the degree to which the variance of the estimated regression coefficients is inflated by the effect of multicollinearity (the standard errors of the estimated coefficients are inflated when multicollinearity is present).

In our model and regression, the VIF factors of the variables are all between 1.3 and 1.5, indicating no multicollinearity:

• l_Portfolio 1.487
• l_Price 1.350
• l_Salesforce 1.468

4.3 Heteroscedasticity

It is important to test for the constancy of the variance of the disturbance term along the sample. A constant variance is designated homoscedasticity, while a non-constant variance is known as heteroscedasticity. Heteroscedasticity, when present, means the OLS estimators are not efficient.

The regression residuals against the fitted dependent variable are depicted below:

[Figure: regression residuals (= observed - fitted l_Sales) plotted against l_Sales]

The tests we have used to evaluate the presence of heteroscedastic residuals were the White test and the Breusch-Pagan test (a regression of the squared residuals on the independent variables, with the null hypothesis of homoscedasticity).
The White test for constant variance is performed through an auxiliary regression of the squared residuals from the original regression on the original variables, their cross products and their squares.

The resulting $R^2$ times the sample size is the Lagrange multiplier statistic, which follows a Chi-squared distribution with a number of degrees of freedom equal to the number of regressors in the auxiliary regression, excluding the constant (9 in this case). The unadjusted $R^2$ of that auxiliary regression is 0.11, the test statistic is 7.69 and the Chi-squared(9) critical value is 16.92 for a 5% significance level. The null hypothesis of the test is homoscedasticity (no heteroscedasticity).

Unadjusted R-squared = 0.114886
Test statistic: TR^2 = 7.697350, with p-value = P(Chi-square(9) > 7.697350) = 0.564910.

Since the Chi-square(9) critical value is 16.919 at 5% significance, thus above the test statistic of 7.697, and the p-value is high, we cannot reject the null hypothesis (heteroscedasticity not present).

White’s test using only squares and the Breusch-Pagan test present similar results:

• White's test for heteroscedasticity (squares only). Null hypothesis: heteroscedasticity not present. Test statistic: LM = 6.96137, with p-value = P(Chi-square(6) > 6.96137) = 0.324434.
• Breusch-Pagan test for heteroscedasticity. Null hypothesis: heteroscedasticity not present. Test statistic: LM = 4.80767, with p-value = P(Chi-square(3) > 4.80767) = 0.186435.

4.4 Normality of the residuals

As shown in more detail in Section 6, the expected value of the random disturbance term is zero, $E(u_t \mid X) = 0$, and the residuals fit a normal distribution well.

4.5 Autocorrelation

Autocorrelation of the residuals is often a problem in time series models. It can be interpreted as the cross-correlation of a signal with itself, or the similarity between observations across time, a sort of repeated pattern buried in the data.
Autocorrelation of the errors can generally be detected because it produces autocorrelation in the observable residuals or error terms. Autocorrelation violates the ordinary least squares (OLS) assumption that the error terms are uncorrelated. While it does not bias the OLS coefficient estimates, the standard errors tend to be
underestimated (and the t-scores overestimated) when the autocorrelations of the errors at low lags are positive.

The Breusch-Godfrey test for autocorrelation of the residuals up to order 12 presents a very high p-value, with a critical value for F(12, 51) of 1.947 (at 5% significance), higher than the test statistic LMF of 0.1359. The null hypothesis (no autocorrelation) cannot be rejected.

The Durbin-Watson statistic was also used, defined as

\[ d = \frac{\sum_{t=2}^{T} (e_t - e_{t-1})^2}{\sum_{t=1}^{T} e_t^2}, \]

where T is the number of observations and $e_t$ the residual at time t. The value of d varies between 0 and 4. The Durbin-Watson tables for T = 67 observations and 3 regressors present a lower bound $d_L$ and an upper bound $d_U$, while the regression Durbin-Watson statistic has a value of 1.8623. This is a value higher than $d_U$ and lower than $4 - d_U$ (2.3012), indicating no statistical evidence that the error terms are negatively or positively correlated (i.e. the disturbance terms are independent). Further estimations are presented in Section 6.

4.6 Model Stability

Chow’s test was used to evaluate whether the coefficients of two regressions of the model, using the data with a break at observation 34, are identical or not. The null hypothesis states that the coefficients are the same (no structural break). As further shown in Section 6, we have not found a structural break, and the dynamics of the sample seem to be coherent over the whole time interval taken for the regression.

4.7 Inference and Predictions

Since our model complies with the OLS base assumptions (see also Section 6), we can safely use statistical inference to draw a few conclusions. The 99% confidence ellipse for Ln(Price) and Ln(Portfolio), the 99% confidence ellipse for Ln(Salesforce) and Ln(Price), and the 99% confidence ellipse for Ln(Portfolio) and Ln(Salesforce) are depicted in the figures below, as well as the observed minus fitted values for the dependent variable, i.e. the regression residuals.
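As a side note on the Durbin-Watson statistic defined in Section 4.5, the sketch below computes d from a list of residuals and evaluates the common large-sample approximation d ≈ 2(1 - rho), where rho is the first-order autocorrelation coefficient of the residuals. The function name and the toy residual series are assumptions made for illustration; only the rho value comes from Table 1.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: squared successive differences over squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


# Toy residual series (hypothetical, not the paper's residuals).
alternating = [1.0, -1.0, 1.0, -1.0]   # strong negative autocorrelation
persistent = [1.0, 1.0, -1.0, -1.0]    # positive autocorrelation

d_alternating = durbin_watson(alternating)  # well above 2
d_persistent = durbin_watson(persistent)    # well below 2

# Large-sample approximation using the rho reported in Table 1.
RHO = 0.068852
d_approx = 2.0 * (1.0 - RHO)

print(f"alternating residuals: d = {d_alternating:.2f}")
print(f"persistent residuals:  d = {d_persistent:.2f}")
print(f"2 * (1 - rho) with rho from Table 1: {d_approx:.4f}")
```

Values of d near 2 point to no first-order autocorrelation, values towards 0 to positive autocorrelation, and values towards 4 to negative autocorrelation. The approximation evaluated at the Table 1 rho gives roughly 1.8623, the figure quoted in Section 4.5.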
[Figure, four panels: 99% confidence ellipse and 99% marginal intervals for l_Portfolio and l_Price (centered at 0.682, -0.689); for l_Price and l_Salesforce (centered at -0.689, 0.6); for l_Portfolio and l_Salesforce (centered at 0.682, 0.6); and regression residuals (= observed - fitted l_Sales), 2007 to 2012]

Predictions can be made by estimating the model of sub-section 3.4.1 on a part of the sample. We did so from February 2007 to December 2011, and then used the resulting regression equation to predict sales from January 2012 to August 2012. Please note that, since this particular regression is made over a sub-sample, the regression coefficients are centered on slightly different values:

t(55, 0.025) = 2.004

Variable        Coefficient    95% Confidence Interval
const            9.77450       (8.37470, 11.1743)
l_Portfolio      0.674835      (0.250883, 1.09879)
l_Price         -0.675421      (-1.08791, -0.262931)
l_Salesforce     0.546571      (0.00497455, 1.08817)

The predictions for the months of 2012 (January to August) are presented below:

[Figure: l_Sales, its forecast and the 95 percent interval, 2010 to 2012]
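A point prediction from the sub-sample regression can be sketched as follows. The coefficients are those of the sub-sample table above; the function names and the input levels (a portfolio of 30 operations, a price of 2.0 Euro/kg, a sales force of 4 people) are hypothetical values chosen only to illustrate the calculation. Exponentiating the log forecast is a simple back-transformation that ignores the retransformation correction for the error variance, which is an assumption of this sketch.

```python
import math

# Coefficients of the sub-sample regression (2007:02 to 2011:12).
COEF = {
    "const": 9.77450,
    "l_Portfolio": 0.674835,
    "l_Price": -0.675421,
    "l_Salesforce": 0.546571,
}


def forecast_log_sales(portfolio, price, salesforce):
    """Point forecast of ln(Sales) for given levels of the regressors."""
    return (COEF["const"]
            + COEF["l_Portfolio"] * math.log(portfolio)
            + COEF["l_Price"] * math.log(price)
            + COEF["l_Salesforce"] * math.log(salesforce))


def forecast_sales(portfolio, price, salesforce):
    """Back-transformed forecast of Sales in Euro (no variance correction)."""
    return math.exp(forecast_log_sales(portfolio, price, salesforce))


# Hypothetical baseline scenario and two ceteris paribus variations.
baseline = forecast_sales(portfolio=30, price=2.0, salesforce=4)
higher_price = forecast_sales(portfolio=30, price=2.2, salesforce=4)
larger_team = forecast_sales(portfolio=30, price=2.0, salesforce=5)

print(f"baseline:         {baseline:,.0f} Euro")
print(f"price +10%:       {higher_price:,.0f} Euro")
print(f"sales force 4->5: {larger_team:,.0f} Euro")
```

Each variation changes one input while holding the others fixed, which is the ceteris paribus reading of the elasticities discussed in sub-section 3.4.1.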
Forecasting for future moments can be performed by setting, in the estimated model equation presented in sub-section 3.4.1, appropriate levels for the variables under the control of management or the sales team, such as sales force size and price, and by introducing target levels for portfolio size, for example.

Forecasting scenarios can be built by fixing the values of two independent variables and varying the third (i.e. ceteris paribus), and then looking at the resulting output for the dependent variable.

4.8 Dynamic Models

To search for the short and long term elasticities of the variables in the model developed, we have run a regression including the lagged term of the dependent variable (Partial Adjustment Model). However, no statistical significance was found for the lagged term (small t-ratio, high p-value), which seems to indicate that the static model is better. A linear restriction test hypothesizing the lagged term coefficient to be zero returned a high p-value, and therefore we cannot reject the null hypothesis that the lagged term coefficient is equal to zero.

OLS, using observations 2007:03-2012:08 (T = 66)
Dependent variable: l_Sales

                Coefficient   Std. Error   t-ratio   p-value
Const            8.89535      1.33457       6.6653   <0.00001   ***
l_Portfolio      0.605383     0.200736      3.0158    0.00373   ***
l_Price         -0.667782     0.205363     -3.2517    0.00187   ***
l_Salesforce     0.508548     0.234869      2.1652    0.03429   **
l_Sales_1        0.0968393    0.109802      0.8819    0.38127

Mean dependent var   11.94211     S.D. dependent var   0.663020
Sum squared resid    11.71626     S.E. of regression   0.438258
R-squared            0.589964     Adjusted R-squared   0.563076
F(4, 61)             21.94182     P-value(F)           2.95e-11
Log-likelihood      -36.60360     Akaike criterion     83.20720
Schwarz criterion    94.15547     Hannan-Quinn         87.53338
rho                 -0.039909     Durbin's h          -0.691776

A further estimation of the dependent variable on all the independent variables, their lagged terms and the lagged term of the dependent variable (Autoregressive Distributed Lag Model) has not shown meaningful significance for most coefficients.

OLS, using observations 2007:03-2012:08 (T = 66)
Dependent variable: l_Sales

                  Coefficient   Std. Error   t-ratio   p-value
Const              9.21283      1.53228       6.0125   <0.00001   ***
l_Portfolio        0.59641      0.221189      2.6964    0.00916   ***
l_Portfolio_1      0.0522092    0.243777      0.2142    0.83117
l_Price           -0.645508     0.22092      -2.9219    0.00495   ***
l_Price_1         -0.0467767    0.223199     -0.2096    0.83473
l_Salesforce      -0.217245     0.71804      -0.3026    0.76331
l_Salesforce_1     0.780354     0.733461      1.0639    0.29177
l_Sales_1          0.056814     0.128607      0.4418    0.66030

Mean dependent var   11.94211     S.D. dependent var   0.663020
Sum squared resid    11.48317     S.E. of regression   0.444956
R-squared            0.598121     Adjusted R-squared   0.549618
F(7, 58)             12.33172     P-value(F)           1.52e-09
Log-likelihood      -35.94047     Akaike criterion     87.88094
Schwarz criterion    105.3982     Hannan-Quinn         94.80283
Rho                 -0.032556     Durbin-Watson        2.053078

Furthermore, the p-values of all linear restriction tests hypothesizing each lagged term to be zero are well above 0.05, which means we cannot reject the null hypothesis for each, i.e. no lagged term is relevant per se. A joint significance test for all lagged term coefficients being zero also returns a p-value much higher than 0.05, so the null hypothesis that all coefficients of the lagged terms are simultaneously zero cannot be rejected. This indicates that the static model continues to seem better than this dynamic variant.

5. Conclusions

We have discussed an econometric model that reasonably explains the sales behavior of an exports business unit of a Portuguese firm, and shown such a model to be straightforward to implement and based on easily gathered data.

Well specified econometric models provide important measures of confidence for the regression coefficients, allowing for more informed forecasting and decision making.

The model developed could easily replace or complement the univariate regressions currently in use, with advantages for the confidence levels of forecasts and for the understanding of the explanatory role of significant variables on sales.
In fact, the econometric model also fits the data better than any of the univariate regressions currently used at the firm, and can be used to build forecasting scenarios with likely better accuracy, by varying the independent variables on a ceteris paribus basis and observing the outcome on the dependent variable.

This study contains limitations as well as opportunities for further research. An obvious limitation could be the exclusion of macroeconomic variables that could help explain demand, and therefore the dependent variable, to a higher degree. However, such a consideration should be more important for business units focusing on homogeneous
markets (national markets), since under such circumstances demand should be more closely linked to GDP per capita, to the average per capita consumption of the specific goods sold, or to both. Competitors' pricing could be added to the model, but such a variable is not easy to include, since the data is not available (our price is the price to intermediaries per unit mass, not a retail price). Promotional and advertising expenditures could also be added (however, in our case, such expenses are mainly an investment of international distributors, and therefore a variable which is both harder to obtain and outside the control of the business unit management).

Another limitation, since the model developed can only explain about 56% of the dependent variable behavior (adjusted R-squared), lies in its limited usefulness for predicting distant periods, as is almost always the case with forecasting techniques. We can only recommend forecasts for no more than six to twelve months into the future, due to the complex dynamics inherent to sales and the resulting uncertainty hanging over any sales forecasting model.

On the other hand, an opportunity for further research could be an attempt to further generalize the econometric model developed, in order to enable it to forecast the sales of different SMEs operating in the FMCG sector (Fast Moving Consumer Goods), and in particular Portuguese FMCG SMEs, which share quite a number of structural characteristics. We also believe the model could be generalized, or easily modified, to explain the sales of export units in other sectors as well.

Another possible opportunity for further research lies in the study of price elasticity over time and of its explanatory power, or use, for brand valuation purposes. It is interesting to note that our log-log model provides a measure of the combined elasticity for the products sold to several international markets. Overall, the coefficient for price - i.e.
the price elasticity - seems to be leaning towards a relatively inelastic demand, which is surprising and very interesting considering the goods sold are easy to replace. This seems to be consistent with the brand factor introduced in the goods, which allows the business unit to export to faraway markets (a case study of its own in its business sector).

6. Annex

All model estimations used Gretl 1.9.10 (5.11.2012 version). The univariate regressions were an output of Excel.

6.1 Assumptions for using the OLS with a Time Series Interpretation

We can immediately state that the form of the model presented in Section 3 is linear (after applying the natural logarithmic function), the number of observations (67) is greater than the number of parameters (4), and there is variation in the values of all independent variables.

The other assumptions need specific testing, shown throughout this section and identified below:
i) Its linearity in the parameters. → Form of the model in the independent variables.
ii) $\operatorname{cov}(u_t, X_t) = 0$. → Hausman test.
iii) $E(u_t \mid X) = 0$. → Conditional normality of residuals.
iv) $\operatorname{var}(u_t \mid X) = \sigma^2$. → Tests for heteroscedasticity.
v) $\operatorname{cov}(u_t, u_s \mid X) = 0$ for $t \neq s$. → Test for autocorrelation.
vi) The number of observations T must be greater than the number of parameters. → T = 67, 4 parameters (including the constant).
vii) There must be variation in the values of all independent variables. → Data.
viii) No exact correlation between the independent variables. → Multicollinearity test.
ix) The model is correctly specified. → RESET test.

6.1 Normality of the Random Disturbance Term

We also know that the expected value of the random disturbance term is zero, $E(u_t \mid X) = 0$ (see plots below), and that it fits a normal distribution well:

Frequency distribution for uhat1, obs 1-67, number of bins = 9, mean = 5.03743e-016, sd = 0.444192.

interval                 midpt       frequency   rel.      cum.
         < -1.1659      -1.2969       2           2.99%     2.99%   *
-1.1659  - -0.90393     -1.0349       0           0.00%     2.99%
-0.90393 - -0.64197     -0.77295      3           4.48%     7.46%   *
-0.64197 - -0.38000     -0.51099      6           8.96%    16.42%   ***
-0.38000 - -0.11804     -0.24902     12          17.91%    34.33%   ******
-0.11804 -  0.14393      0.012943    20          29.85%    64.18%   **********
 0.14393 -  0.40589      0.27491     13          19.40%    83.58%   ******
 0.40589 -  0.66785      0.53687      7          10.45%    94.03%   ***
        >=  0.66785      0.79884      4           5.97%   100.00%   **

Test for null hypothesis of normal distribution: Chi-square(2) = 5.662 with p-value 0.05894.

[Figure: histogram of uhat1 against the N(5.0374e-016, 0.44419) density; test statistic for normality: Chi-square(2) = 5.662 [0.0589]]

[Figure: Q-Q plot for uhat1 against normal quantiles, with the line y = x]
The Chi-square distribution with 2 degrees of freedom has a critical value of 5.99146 at a significance level of 5%, which is above the test statistic of 5.662. The p-value is also above 0.05, so the null hypothesis of normality cannot be rejected.

6.2 Heteroscedasticity

White's test for heteroscedasticity gives no evidence against a constant variance of the disturbance term.

White's test for heteroscedasticity with null hypothesis: heteroscedasticity not present. Test statistic: LM = 7.69735 with p-value = P(Chi-square(9) > 7.69735) = 0.56491.

White's test for heteroscedasticity, OLS, using observations 1-67, dependent variable: uhat^2.

                   coefficient   std. error   t-ratio   p-value
const              -4.80831      3.75831      -1.279    0.2059
l_Portfolio        2.65727       1.84708      1.439     0.1557
l_Price            1.57591       1.81795      0.8669    0.3897
l_Salesforce       1.18288       4.07945      0.2900    0.7729
sq_l_Portfolio     -0.460071     0.294185     -1.564    0.1234
X2_X3              -0.223413     0.455339     -0.4907   0.6256
X2_X4              0.308763      0.945163     0.3267    0.7451
sq_l_Price         -0.311194     0.284098     -1.095    0.2780
X3_X4              -0.347550     1.22089      -0.2847   0.7769
sq_l_Salesforce    -1.05806      1.11451      -0.9494   0.3465

Unadjusted R-squared = 0.114886

Test statistic: TR^2 = 7.697350, with p-value = P(Chi-square(9) > 7.697350) = 0.564910.

The critical value of the Chi-square(9) distribution at 5% significance is 16.919, well above the test statistic of 7.697. The p-value is high and we cannot reject the null hypothesis (heteroscedasticity not present).

6.3 Multicollinearity

We can also state that there is no sign of problematic collinearity among the independent variables. This has been assessed by calculating the Variance Inflation Factors (VIF), an index measuring how much the variance of an estimated regression coefficient increases because of collinearity. No coefficient presents a VIF above 10.

Variance Inflation Factors
Minimum possible value = 1.0
Values > 10.0 may indicate a collinearity problem
l_Portfolio     1.487
l_Price         1.350
l_Salesforce    1.468

VIF(j) = 1/(1 - R(j)^2), where R(j) is the multiple correlation coefficient between variable j and the other independent variables.

6.4 Autocorrelation

The Breusch-Godfrey test below gives no evidence of autocorrelation in the disturbance term.

Breusch-Godfrey test for autocorrelation up to order 12, OLS, using observations 2007:02-2012:08 (T = 67), dependent variable: uhat.

                coefficient     std. error   t-ratio     p-value
const           -0.216662       0.802325     -0.2700     0.7882
l_Portfolio     0.0735869       0.253416     0.2904      0.7727
l_Price         0.0180643       0.228217     0.07915     0.9372
l_Salesforce    -0.0311841      0.267454     -0.1166     0.9076
uhat_1          0.0798734       0.141615     0.5640      0.5752
uhat_2          -0.107188       0.141800     -0.7559     0.4532
uhat_3          0.0342331       0.143921     0.2379      0.8129
uhat_4          0.0513473       0.143277     0.3584      0.7215
uhat_5          -0.0578235      0.147794     -0.3912     0.6972
uhat_6          -0.00449540     0.142725     -0.03150    0.9750
uhat_7          -0.0525654      0.145586     -0.3611     0.7195
uhat_8          -0.000852480    0.144410     -0.005903   0.9953
uhat_9          0.0149031       0.144067     0.1034      0.9180
uhat_10         0.0599730       0.146162     0.4103      0.6833
uhat_11         0.0252232       0.149431     0.1688      0.8666
uhat_12         -0.0493138      0.155429     -0.3173     0.7523

Unadjusted R-squared = 0.030998

LM test for autocorrelation up to order 12 - Null hypothesis: no autocorrelation - Test statistic: LMF = 0.135955, with p-value = P(F(12,51) > 0.135955) = 0.999695.

The critical value for F(12, 51) at a 5% significance level is far higher than 0.135955, and the p-value is close to 1. The null hypothesis of no autocorrelation cannot be rejected. The alternative statistics reported for this test also result in high p-values (no autocorrelation):

• Test statistic: LMF = 0.135955, with p-value = P(F(12,51) > 0.135955) = 1;
• Alternative statistic: TR^2 = 2.076849, with p-value = P(Chi-square(12) > 2.07685) = 0.999; and,
• Ljung-Box Q' = 2.15796, with p-value = P(Chi-square(12) > 2.15796) = 0.999.

Finally, the Durbin-Watson table for the sample size and number of regressors of this model presents a lower bound $d_L$ and an upper bound $d_U$, while the regression Durbin-Watson statistic has a value of 1.8623. This value is higher than $d_U$ and lower than $4 - d_U$ (2.3012), indicating no first-order autocorrelation in the error terms.

6.5 Model Specification

The RESET test below indicates that the model is well specified.

RESET test for specification - Null hypothesis: specification is adequate - Test statistic: F(2, 61) = 0.137224, with p-value = P(F(2, 61) > 0.137224) = 0.872044.

At a significance level of 5%, the critical value of F(2, 61) is larger than the test statistic, the p-value is above 0.05, and we cannot reject the null hypothesis of correct specification. The auxiliary regression for the RESET specification test is:

Auxiliary regression for RESET specification test, OLS, using observations 1-67, dependent variable: l_Sales.

                coefficient   std. error   t-ratio   p-value
const           -147.736      683.209      -0.2162   0.8295
l_Portfolio     -18.2850      79.6310      -0.2296   0.8192
l_Price         18.4218       80.3907      0.2292    0.8195
l_Salesforce    -16.2096      70.0333      -0.2315   0.8177
yhat^2          2.27325       10.0222      0.2268    0.8213
yhat^3          -0.0616774    0.286432     -0.2153   0.8302

Test statistic: F = 0.137224, with p-value = P(F(2,61) > 0.137224) = 0.872.

The variants of the test with squares only and with cubes only also indicate an adequate model specification:

• RESET test for specification (squares only), Test statistic: F = 0.231643, with p-value = P(F(1,62) > 0.231643) = 0.632; and,
• RESET test for specification (cubes only), Test statistic: F = 0.226464, with p-value = P(F(1,62) > 0.226464) = 0.636.

6.6 Structural Break Test (Stability)

Chow's test shows no structural break in the model when a break at observation 34 is tested.

Chow test for structural break at observation 34 - Null hypothesis: no structural break
Test statistic: F(3, 60) = 0.434218, with p-value = P(F(3, 60) > 0.434218) = 0.729285.

At a significance level of 5%, the critical value of F(3, 60) is greater than the test statistic, the p-value is therefore well above 0.05, and we cannot reject the null hypothesis of no structural break in the model.

6.7 Residual Analysis

The residuals are plotted below against the dependent variable and against each explanatory variable. Actual versus predicted sales (in logs) are plotted as well.

[Figure: regression residuals (= observed - fitted l_Sales) against l_Sales.]
[Figure: regression residuals (= observed - fitted l_Sales) against l_Portfolio.]
[Figure: regression residuals (= observed - fitted l_Sales) against l_Price.]
[Figure: regression residuals (= observed - fitted l_Sales) against l_Salesforce.]
[Figure: actual l_Sales against predicted l_Sales, with the reference line actual = predicted.]
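As a cross-check on Sections 6.2 and 6.4, the LM statistics can be recomputed from the unadjusted R-squared of each auxiliary regression. The chi-square form is T x R^2. The F form reported for the Breusch-Godfrey test appears consistent with (R^2 / p) / ((1 - R^2) / (T - k - p)), where p is the number of lagged residuals and k the number of parameters of the original model. The sketch below uses only the values printed in those sections; the helper names are ours, and small differences from the Gretl figures are expected because the printed R-squared values are rounded.

```python
def lm_chi2(n_obs: int, r_squared: float) -> float:
    """Chi-square form of an LM test: T times the auxiliary R-squared."""
    return n_obs * r_squared


def lm_f(n_obs: int, r_squared: float, n_params: int, n_lags: int) -> float:
    """F form of the LM test, with (n_lags, n_obs - n_params - n_lags) d.f."""
    df_resid = n_obs - n_params - n_lags
    return (r_squared / n_lags) / ((1.0 - r_squared) / df_resid)


T = 67  # number of observations
K = 4   # parameters in the sales model, including the constant

# White's test: auxiliary R-squared = 0.114886 (Section 6.2).
white_lm = lm_chi2(T, 0.114886)

# Breusch-Godfrey test with 12 lags: auxiliary R-squared = 0.030998 (Section 6.4).
bg_tr2 = lm_chi2(T, 0.030998)
bg_lmf = lm_f(T, 0.030998, K, 12)

print(f"White LM (TR^2)       = {white_lm:.4f}")  # Gretl reports 7.69735
print(f"Breusch-Godfrey TR^2  = {bg_tr2:.4f}")    # Gretl reports 2.076849
print(f"Breusch-Godfrey LMF   = {bg_lmf:.4f}")    # Gretl reports 0.135955
```

With T = 67, k = 4 and p = 12 the residual degrees of freedom are 51, which matches the F(12, 51) distribution quoted in Section 6.4.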
6.8 Normality of the Dependent Variable

The normality of the dependent variable can be visually assessed by means of a Q-Q plot or a density plot. This property is shown as a complement to the base data, as it is not a basic assumption for the correct application of OLS to regressions.

Frequency distribution for l_Sales, obs 1-67
number of bins = 9, mean = 11.9272, sd = 0.669145

interval            midpt     frequency   rel.      cum.
< 10.326            10.151    3           4.48%     4.48%     *
10.326 - 10.677     10.502    0           0.00%     4.48%
10.677 - 11.027     10.852    2           2.99%     7.46%     *
11.027 - 11.377     11.202    8           11.94%    19.40%    ****
11.377 - 11.727     11.552    13          19.40%    38.81%    ******
11.727 - 12.077     11.902    14          20.90%    59.70%    *******
12.077 - 12.428     12.253    5           7.46%     67.16%    **
12.428 - 12.778     12.603    18          26.87%    94.03%    *********
>= 12.778           12.953    4           5.97%     100.00%   **

Test for null hypothesis of normal distribution: Chi-square(2) = 5.250 with p-value 0.07245.

The Chi-square distribution with 2 degrees of freedom has a critical value of 5.991 at a significance level of 5%, which is above the test statistic of 5.250. The p-value is also above 0.05. The null hypothesis (normal distribution) cannot be rejected.

[Figure: Q-Q plot for l_Sales against normal quantiles, with the reference line y = x.]

[Figure: density of l_Sales compared with N(11.927, 0.66915). Test statistic for normality: Chi-square(2) = 5.250 [0.0724].]

7. References

Armstrong, J., 1985. Long-Range Forecasting: From Crystal Ball to Computer. New York: John Wiley.
Armstrong, J., Brodie, R. & McIntyre, S., 1987. Forecasting methods for marketing. International Journal of Forecasting, Volume 3, pp. 355-76.
Armstrong, J. & Collopy, F., 1998. Integration of statistical methods and judgment for time series forecasting: principles from empirical research. s.l.: s.n.
Baker, M., 1999. Sales Forecasting. In: IEBM Encyclopedia of Marketing. s.l.: International Thomson Business Press, pp. 278-290.
Banco de Portugal, 2011. A economia portuguesa em 2011. Lisboa: Departamento de Estudos Económicos.
Blattberg, R. & Hoch, S., 1990. Database models and managerial intuition: 50 per cent model + 50 per cent manager. Management Science, Volume 36, pp. 887-99.
Clemen, R., 1989. Combining forecasts: A review and annotated bibliography. International Journal of Forecasting, Volume 5, pp. 559-83.
Crina, O., Bolton, R., Michael, D. & Walker, B., 2011. Balancing Risk and Return in a Customer Portfolio. Journal of Marketing, Volume 75, pp. 1-17.
Dalrymple, D., 1975. Sales forecasting: methods and accuracy. Business Horizons, pp. 69-73.
Dalrymple, D., 1987. Sales forecasting practices: Results from a U.S. survey. International Journal of Forecasting, Volume 3, pp. 379-91.
Gardner, E. & McKenzie, E., 1985. Forecasting trends in time series. Management Science, Volume 31, pp. 1237-46.
Geweke, J., Horowitz, J. & Pesaran, H., 2008. Econometrics, The New Palgrave Dictionary of Economics. s.l.: Palgrave Macmillan.
Gujarati, D. & Porter, D., 2009. Basic Econometrics. 5th ed. s.l.: McGraw-Hill/Irwin.
Lilien, G., Kotler, P. & Moorthy, K., 1992. Marketing Models. s.l.: Prentice Hall, Inc.
Makridakis, S., 1984. The Forecasting Accuracy of Major Time Series Methods. New York: John Wiley.
Marshall, A., 1890. The Principles of Economics. London: Macmillan and Co. Ltd.
Mentzer, J. & Kahn, K., 1995. Forecasting technique familiarity, satisfaction, usage, and application. Journal of Forecasting, Volume 14, pp. 465-76.
Rumelt, R., 1991. How much does industry matter? Strategic Management Journal, 12(3), pp. 167-181.
Schnaars, S., 1984. Situational factors affecting forecasting accuracy. Journal of Marketing Research, Volume 21, pp. 290-297.
The author would like to thank Professor Elias Soukiazis, of Coimbra Faculty of Economics, for having reviewed this paper and for proposing additional diagnostic tests for the models presented.