2011-02-04 - D. Sallier - Kenza method
1. Long-term Demand Forecast:
The Kenza Approach
Prepared & presented by Daniel SALLIER
Air Traffic Data & Forecasting Director
Porto, ATRS 2010
2. Content
A. Foreword
B. The 1st Kenza law of consumer demand: an empirical approach
C. Long-term demand elasticity change
D. Kenza approach vs more classical ones
E. Discussion & conclusion
Paris, ENGREF, 4 February 2011
4. Air Transport, an industry with strong expectations for accurate (very) long-term forecasts
With the notable exception of the airlines, for which 3 to 5 years of market visibility is quite sufficient, most other actors in this industry need 15 to 20 years of visibility:
– Aircraft & engine manufacturers: because of project development lead times (10 years at the very minimum) and long production runs before any possible return on investment;
– Airports & ATCs: same reasons as for the aircraft and engine manufacturers;
– Export credit organisations & bankers: risk assessment on residual values (per aircraft type and per aircraft portfolio);
– Civil Aviation Authorities & civil aviation related organisations (e.g. ICAO, IATA, ACI, European Commission, etc.): rules enactment, market regulation, state support, infrastructure financing.
5. General methodological background and related drawbacks
Most demand/traffic models, worldwide and industry-wide, are econometric models.
Econometric modellers face 3 conditions which, most of the time, they can hardly meet for long-term forecasting:
Twice-as-long rule of thumb.
Time series should be about twice as long as the forecasting horizon: a 40-year time series for a 20-years-ahead forecast.
Steady consumer behaviour over time.
Given the inertia of the social and economic behaviour of any population, this is a valid assumption for short- and mid-term forecasts (up to 5 years ahead). It is a far more dubious assumption over longer terms.
“Clean” time series.
Unfortunately, the probability is quite high that events (epidemics, wars, terrorist attacks, paranoid governments, etc.) “contaminate” the available historical data by causing a transient change in consumer habits, which requires the corresponding data to be disregarded as outliers.
6. Twice-as-long rule: the US domestic market case, one of the best cases of data availability in the world!
All 8 demand models show excellent calibration characteristics over the same 1982-1990 period, so calibration alone cannot identify the best one(s) among them.
A retrospective traffic forecast over the 1991-2009 period allows us to identify:
– the linear model in real US dollars, based on GDP, price & population, as the best candidate for providing rather accurate 10-years-ahead forecasts (1991-2000);
– most of the models as usable for short/mid-term, 5-years-ahead forecasts (up to 1996).
GDP/capita based models do not provide good forecasts, even for the short and medium term.
7. “Clean” dataset issue: the US domestic market case, one of the best cases of data availability in the world!
A. 1975-2009: 34 years of traffic history and 35 annual data points available.
B. 1975-1980: pre-deregulation period: 6 years of unusable traffic data.
C. 1991-1994: the first Gulf War and its aftermath: 4 years of unusable traffic data.
D. 2001-2004: Sept. 11 attacks + Afghanistan war + the second Gulf War + SARS: 4 years of unusable traffic data.
E. 2008…: the sub-prime crisis turning into a worldwide financial crisis, turning into a severe economic recession: 2 years of hardly usable traffic data.
Only 19 “usable” annual data points are left out of a total of 35.
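The tally above can be checked in a few lines. This is only a bookkeeping sketch: the dictionary labels are chosen here for illustration, the year counts are the ones quoted on the slide, and applying the twice-as-long rule of thumb to the usable points alone is an assumption made for the sake of the example.

```python
# Annual data points available for the US domestic market, 1975-2009 inclusive.
total_points = 2009 - 1975 + 1

# Years discarded, as counted on the slide (labels are illustrative only).
discarded = {
    "pre-deregulation (1975-1980)": 6,
    "first Gulf War and aftermath (1991-1994)": 4,
    "Sept. 11, Afghanistan, second Gulf War, SARS (2001-2004)": 4,
    "financial crisis (2008 onwards)": 2,
}

usable_points = total_points - sum(discarded.values())
print(total_points, usable_points)  # 35 19

# Assumption: if the twice-as-long rule of thumb were applied to the usable
# points only, the history would support a horizon of roughly half that count.
max_horizon_years = usable_points / 2
print(max_horizon_years)  # 9.5
```

Under that reading, the cleanest dataset in the industry would support a horizon of under 10 years, far short of the 15 to 20 years the industry asks for.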
8. Behavioural modelling: a way to provide the information missing in time series
Most of the time the 3 prerequisites for a fully “orthodox” use of econometric models cannot be met for long-term forecasting:
– Twice-as-long rule of thumb.
– Steady consumer behaviour over time.
– “Clean” time series.
But isn’t it better to have something than nothing to work with?
A behavioural approach, which models consumers’ behaviour, should turn out to be much less “data-thirsty”, since part of the information missing from the time series in a “classical” approach is provided by the behaviour model.
9. Behavioural modelling: a way to provide the missing information in time series (continued)
[Figure: a comet appears in the sky. First measurements at t1; latest measurements at t2, 2 months later; anticipated comet position at t3 using a “classical” approach vs. using a “behavioural” approach (Newton’s law of motion).]
Newton’s law of motion…
$$F = C \cdot \frac{M_1 \cdot M_2}{R^2}$$
… is not and has never been a matter of correlations between F, M1, M2 and R.
10. The 1st Kenza law of consumer demand: an empirical approach
11. Why do people fly?
3 very basic ideas for a very simple question
People who pay for their own ticket fly because:
• They feel like it or they have to;
• They can afford it;
• They are sensitive to the price of the ticket/inclusive tour relative to their income.
People flying on a company-paid ticket are among the better paid in the company. It simply means that annual individual income could be a good indicator of their likelihood of being granted paid business trips.
12. Demand formation: 1st formulation
A. The ticket (inclusive tour) price should determine a minimum income threshold;
B. Given the distribution of income and the income threshold, it is possible to determine the part of the population which can afford to fly: the elected population;
C. Only part of the elected population actually buys a ticket (inclusive tour): the actual customers;
D. Each customer buys an average number of flights, which sum up to the total demand.
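The four steps above can be sketched on a toy population. Everything numeric here is made up for illustration, and the parameter names (`affordability_ratio`, `buying_share`, `flights_per_customer`) are hypothetical: the slide does not say how the threshold is derived from the price, so a simple proportional rule is assumed.

```python
def total_demand(incomes, ticket_price, affordability_ratio,
                 buying_share, flights_per_customer):
    """Four-step demand formation on a list of individual annual incomes."""
    # Step 1: the ticket price determines a minimum income threshold.
    # Assumption: the threshold is simply proportional to the price.
    income_threshold = affordability_ratio * ticket_price
    # Step 2: elected population = people whose income reaches the threshold.
    elected = sum(1 for income in incomes if income >= income_threshold)
    # Step 3: only part of the elected population actually buys a ticket.
    customers = buying_share * elected
    # Step 4: each customer buys an average number of flights.
    return customers * flights_per_customer


incomes = [12_000, 25_000, 40_000, 60_000, 150_000]  # made-up sample

demand = total_demand(incomes, ticket_price=500, affordability_ratio=60,
                      buying_share=0.5, flights_per_customer=2)
print(demand)  # 3.0 (threshold 30,000: three people are "elected")

cheaper = total_demand(incomes, ticket_price=200, affordability_ratio=60,
                       buying_share=0.5, flights_per_customer=2)
print(cheaper)  # 5.0 (threshold 12,000: everyone is "elected")
```

A lower price lowers the threshold, which enlarges the elected population; in this reading, that is the only channel through which price acts on demand.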
13. Demand formation: 2nd formulation; data normalisation
Price & population normalisation:
Instead of using actual monetary values, we use price and income divided by a normalisation quantity such as the average income per capita or the GDP per capita. Instead of using the cumulative distribution of income over the population, we use the cumulative distribution of normalised income over the % of total population: the so-called Kenza distribution.
– Advantage #1: no inflation bias in the model (a normalised price or income stays the same whether computed from real or current values) and no currency exchange rate bias;
– Advantage #2: the Kenza distribution is very steady over time, which is very convenient for long-term forecasting.
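A minimal sketch of the normalisation, on a made-up income sample. It assumes the distribution is read as a decreasing curve (share of the population whose normalised income is at least r), which is consistent with the idea of an elected population above a threshold; the function name `kenza_distribution` is hypothetical.

```python
def kenza_distribution(incomes, normaliser):
    """Return F*(r): share of the population with normalised income >= r.

    Assumption: F* is the complementary cumulative distribution of income
    divided by the normalisation quantity (e.g. average income per capita).
    """
    normalised = [income / normaliser for income in incomes]
    n = len(normalised)

    def f_star(r):
        return sum(1 for x in normalised if x >= r) / n

    return f_star


incomes = [12_000, 25_000, 40_000, 60_000, 150_000]  # made-up sample
mean_income = sum(incomes) / len(incomes)
f_star = kenza_distribution(incomes, mean_income)
print(f_star(1.0))  # 0.4: two people out of five earn at least the average

# Advantage #1: double every income (pure inflation or a currency change);
# the normaliser doubles too, so the normalised distribution is unchanged.
inflated = [2 * income for income in incomes]
f_star_inflated = kenza_distribution(inflated, 2 * mean_income)
print(f_star_inflated(1.0))  # 0.4
```

With survey data in double-entry form, the same curve would be built from income brackets and household counts instead of individual records.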
14. Kenza distributions: data source
As a general rule, 3 sources of household income data exist:
– Census organisations such as the US Census Bureau;
– National statistics administrations such as DESTATIS in Germany, INSEE in France, Statistics Canada in Canada, etc.;
– Specific organisations such as Banca d’Italia in Italy or the DWP (Department for Work and Pensions) in the UK.
The data most commonly come in 2 formats:
– Double-entry tables which cross income categories and household categories, such as the ones disseminated by the US Census Bureau;
– Individual data tables which provide household characteristics such as income, number of members, etc. for each household of the sample.
Household income data always result from a population survey.
15. Kenza distributions: pretty steady over time; the US case
The US Kenza distributions prove to be very steady over time even though, over the same period, US household structure and income may have changed significantly: steadiness is a macro phenomenon, not a micro one.
The US case is very far from being unique (e.g. Italy, San Salvador, even Poland).
It is a very convenient characteristic for long-term forecasting!
16. 1st formulation of the Kenza law of demand
$$D = P \cdot K_{1,1} \cdot K_{1,2} \cdot F^{*}(K_2 \cdot p_n)$$
17. 2nd formulation of the Kenza law of demand: the “percolation” principle
$$D = P \cdot K_{1,1} \cdot K_{1,2} \cdot F^{*}(K_2 \cdot p_n)$$
$$D(t) = P(t) \cdot K_1 \cdot F^{*}(K_2 \cdot p_n(t))$$
We assume that K1,1, K1,2 and K2 are constant over time but differ from one population to another. K1,1 and K1,2 are combined into a single constant K1.
Such an assumption is equivalent to asserting that consumer behaviour is fully characteristic of a given population and not likely to change over time or, stated differently, that people tend to adopt the consumption habits of their “adjacent” upper social class as soon as they join it. It is a “percolation” social behaviour proposed and detailed by James Stemble DUESENBERRY in "Income, Saving and the Theory of Consumer Behavior", Harvard University Press, January 1949.
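The time-dependent form D(t) = P(t) · K1 · F*(K2 · pn(t)) can be written out as a short function. This is a sketch under stated assumptions, not the author's implementation: a real Kenza distribution comes from household income surveys, whereas the log-normal shape used here (`f_star_lognormal`, with an arbitrary `sigma`) is a stand-in chosen only so that the example runs, and all numeric inputs are made up.

```python
import math


def f_star_lognormal(r, sigma=0.8):
    """Hypothetical Kenza distribution: share of the population whose
    normalised income is >= r, for a log-normal income with mean 1."""
    if r <= 0:
        return 1.0
    mu = -0.5 * sigma ** 2  # sets the mean of the normalised income to 1
    z = (math.log(r) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))  # 1 - Phi(z)


def kenza_demand(population, normalised_price, k1, k2, f_star=f_star_lognormal):
    """D(t) = P(t) * K1 * F*(K2 * pn(t)), with K1 and K2 held constant."""
    return population * k1 * f_star(k2 * normalised_price)


# Made-up inputs: same population and constants, two normalised price levels.
d_expensive = kenza_demand(1_000_000, normalised_price=0.02, k1=1.5, k2=40)
d_cheap = kenza_demand(1_000_000, normalised_price=0.01, k1=1.5, k2=40)
print(d_expensive < d_cheap)  # True: a lower normalised price elects more people
```

Only P(t) and pn(t) vary over time; K1 and K2 are fitted once on the calibration period and then kept fixed, which is what the constancy assumption amounts to in practice.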
18. Some examples of phenomena that are very steady over time
– The relationship between Paris (air) traffic and the French GDP (the same holds between US domestic air traffic and US GDP);
– The Kenza distribution of income in the USA;
– Ground transportation: Andreas SCHÄFER and David G. VICTOR, 2000, in "The Future Mobility of the World Population", Transportation Research A, Volume 34, issue 3, pp. 171-205: "On average a person spends 1.10 h per day traveling and devotes a predictable fraction of income to travel. We show that these time and money budgets are stable over space and time and can be used for projecting future levels of mobility and transport mode."
19. US domestic market as a test bench
The US domestic (air) market is the best documented traffic in the world, which makes it a “natural” test bench.
The linear regression between the actual traffic data and its Kenza estimate over the uneventful periods 1981-1990, 1994-2000 and 2005-2008 results in an R² of 0.975.
It looks like the Kenza approach delivers what it was designed for: the ability to provide long-term forecasts from a limited number of historical data points.
20. Long-term demand elasticity change
21. Kenza intrinsic elasticity
Let us define the intrinsic elasticity of a Kenza distribution as:
$$\varepsilon_K(r_n) = \frac{dF^{*}(r_n)}{F^{*}(r_n)} \cdot \frac{r_n}{dr_n} \le 0$$
This quantity depends only on the shape of the Kenza distribution, independently of any direct consideration of demand functions.
It can be proved that the intrinsic elasticity is completely independent of the normalisation quantity used (e.g. GDP/capita, average individual income, median individual income, etc.) and is thus fully characteristic of the population studied.
The price elasticity of demand is equal to:
$$\varepsilon_p = \varepsilon_K(K_2 \cdot p_n)$$
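The last equality follows in one line from the 2nd formulation of the Kenza law, D = P · K1 · F*(K2 · pn), with P, K1 and K2 held fixed. The second line sketches why the normalisation quantity does not matter; the rescaling factor c and the rescaled distribution G* are notation introduced here for illustration only.

```latex
\varepsilon_p
  = \frac{\partial D}{\partial p_n} \cdot \frac{p_n}{D}
  = \frac{P \cdot K_1 \cdot K_2 \cdot \frac{dF^{*}}{dr_n}(K_2 \cdot p_n) \cdot p_n}
         {P \cdot K_1 \cdot F^{*}(K_2 \cdot p_n)}
  = \frac{K_2 \cdot p_n}{F^{*}(K_2 \cdot p_n)} \cdot \frac{dF^{*}}{dr_n}(K_2 \cdot p_n)
  = \varepsilon_K(K_2 \cdot p_n)

% Change of normalisation quantity: r_n' = c \cdot r_n, so G^{*}(r_n') = F^{*}(r_n' / c)
\varepsilon_G(r_n')
  = \frac{r_n'}{G^{*}(r_n')} \cdot \frac{dG^{*}}{dr_n'}(r_n')
  = \frac{r_n' / c}{F^{*}(r_n' / c)} \cdot \frac{dF^{*}}{dr_n}(r_n' / c)
  = \varepsilon_K(r_n)
```

P and K1 cancel out, so the price elasticity is read directly off the Kenza distribution at the point K2 · pn, and the same actual income level gives the same elasticity whichever normalisation quantity is used.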
22. Kenza intrinsic elasticity as a function of the elected population: the long-term market maturation process
Analysis of the actual Kenza distributions of several countries, such as the USA, shows quite large sections with an almost linear relationship between the intrinsic elasticity and the corresponding elected population (sorted by decreasing level of income).
The 1st Kenza law of demand "naturally" takes into account the rather linear long-term market maturation process (price elasticity of demand decreasing in absolute value). The market maturation process is not related to the market studied. In our example, it means that it has nothing to do with the US domestic air market itself. It results exclusively from the way individual incomes are distributed within the population considered. The only market-related phenomenon is how fast the elected population grows or shrinks over time.
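The direction of the maturation effect can be illustrated numerically. The sketch below assumes a log-normal stand-in for the Kenza distribution (the real curves come from income surveys, and nothing here reproduces the near-linear sections observed on actual data); it only shows the elasticity shrinking in absolute value as the elected population grows. The threshold values are arbitrary.

```python
import math


def f_star(r, sigma=0.8):
    """Hypothetical log-normal Kenza distribution: share of the population
    with normalised income >= r (mean normalised income equal to 1)."""
    mu = -0.5 * sigma ** 2
    return 0.5 * math.erfc((math.log(r) - mu) / (sigma * math.sqrt(2)))


def intrinsic_elasticity(r, h=1e-6):
    """epsilon_K(r) = (r / F*(r)) * dF*/dr, using a central finite difference."""
    derivative = (f_star(r * (1 + h)) - f_star(r * (1 - h))) / (2 * r * h)
    return r * derivative / f_star(r)


# A falling normalised threshold means a growing elected population.
thresholds = [2.0, 1.5, 1.0, 0.75, 0.5]
rows = [(f_star(r), intrinsic_elasticity(r)) for r in thresholds]
for elected_share, eps in rows:
    print(f"elected population {elected_share:6.1%}   elasticity {eps:6.2f}")
```

The elasticity stays negative throughout and moves toward zero as more of the population becomes elected, whatever market the threshold comes from.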
23. Kenza approach vs more classical ones
24. Kenza vs econometric models
8 econometric models are considered, which are detailed hereafter:
All the models, the Kenza model included, are calibrated on the same 1982-1990 period of US domestic (air) traffic.
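The slides do not describe how the Kenza model is fitted, so the following is only one plausible way to calibrate K1 and K2 over a window such as 1982-1990: a least-squares grid search on K2, with K1 obtained in closed form because demand is linear in K1. The log-normal `f_star`, the function name `calibrate` and every number are assumptions; the "observed" series is synthetic, generated from known constants so that the recovery can be checked.

```python
import math


def f_star(r, sigma=0.8):
    """Hypothetical log-normal Kenza distribution (share with income >= r)."""
    mu = -0.5 * sigma ** 2
    return 0.5 * math.erfc((math.log(r) - mu) / (sigma * math.sqrt(2)))


def calibrate(populations, prices, observed, k2_grid):
    """Least-squares fit of (K1, K2) in D = P * K1 * F*(K2 * pn)."""
    best = None
    for k2 in k2_grid:
        # Model output for K1 = 1; the best K1 is then a simple ratio.
        m = [p * f_star(k2 * pn) for p, pn in zip(populations, prices)]
        k1 = sum(d * x for d, x in zip(observed, m)) / sum(x * x for x in m)
        sse = sum((d - k1 * x) ** 2 for d, x in zip(observed, m))
        if best is None or sse < best[0]:
            best = (sse, k1, k2)
    return best[1], best[2]


# Synthetic 1982-1990 window (made-up inputs, not real traffic data).
years = list(range(1982, 1991))
populations = [230e6 + 2.2e6 * i for i in range(len(years))]
prices = [0.020 - 0.0008 * i for i in range(len(years))]  # normalised prices
true_k1, true_k2 = 1.5, 40.0
observed = [p * true_k1 * f_star(true_k2 * pn)
            for p, pn in zip(populations, prices)]

k2_grid = [20.0 + 0.5 * j for j in range(81)]  # candidate K2 from 20 to 60
k1_hat, k2_hat = calibrate(populations, prices, observed, k2_grid)
print(round(k1_hat, 6), k2_hat)  # 1.5 40.0 on this noise-free synthetic series
```

On real data the series is noisy and the grid would be refined around the best candidate, but the structure stays the same: two constants fitted once, then frozen for the forecast.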
25. US domestic market for methodological benchmarking
All the models show excellent calibration characteristics.
26. US domestic market for methodological benchmarking (continued)
GDP/capita based models do not provide adequate forecasting accuracy, even for short- and mid-term purposes.
The Kenza model and the linear model based on GDP and prices stated in real dollars provide a good 10-years-ahead forecast.
Only the Kenza model shows decent accuracy up to 18 years ahead (within the frame of this example, of course).
27. US domestic market for methodological benchmarking: model parameter stability
A behavioural model should be rather insensitive to the selected calibration period. That is exactly
what the Kenza model proves to be, at least within the frame of this example.
28. US domestic market for methodological benchmarking: the “added” value of the Kenza distribution
The simplified Kenza model is built on a linear section of the Kenza distribution over the calibration period, and this linear section is extrapolated into the future.
The difference between the Kenza model and the simplified Kenza model gives the added value, mostly for long-term purposes, of using the Kenza distribution: a non-linear effect.
29. Discussion & conclusion
30. Is it a “mathematical freak”? …
A Kenza model is based on an S-shaped curve (the Kenza distribution) which can be stretched/shrunk horizontally (the K2 parameter) to adjust the trend slope and stretched/shrunk vertically (the K1 parameter) to adjust the absolute volume. Whatever the S-curve selected, the probability is high of finding a pair of parameters (K1, K2) which allows a fine “tuning” of a Kenza estimate to the actual historical data.
[Figure: K2 effect and K1 effect on the S-curve.]
31. … or does it sustain its claim of being a behavioural approach?
The S-curve approach can explain the short- and mid-term forecasting capabilities, which are confirmed by the simplified Kenza approach.
But it cannot explain the long-term forecasting capability, since the calibration process does not take into account the future non-linear corrections introduced by the Kenza distribution…
So we keep claiming it is a behavioural approach and not a “mathematical freak”.
32. The very “disturbing” assumption that K1 and K2 are constant
A. Assuming K1 and K2 constant over time is quite a strong assumption to make.
B. Examples of phenomena that are very steady over very long periods of time can be quoted and have been quoted (e.g. ground transportation time, air traffic vs GDP, etc.). They only prove that it is possible, not that it is the case.
C. The long-term forecasting capability of a Kenza model should be considered as an indirect clue only.
D. Additional research work, yet to be published, based on the aggregation of individual probabilistic consumption behaviour, fully supports the assumption that K1 and K2 are constant over time.
33. Conclusions
Kenza modelling of consumer demand proves:
– to provide rather accurate long-term forecasts, as illustrated in the case of the US domestic air market;
– to be a pretty robust model, the low sensitivity of its parameters to changes in the calibration period being an indirect confirmation of the "behavioural" nature of its modelling. As a side product, it gives access to the demographic, social and economic background of a market, which proves to be a valuable asset, mostly when presenting the outputs to a top management meeting. It is far more comfortable to come with explanations than with correlations!;
– to provide, although this was not anticipated when the method was developed, additional valuable pieces of information such as the intrinsic elasticity, which directly drives the long-term evolution of demand elasticities and may provide clues on the market maturation process;
– to be a more general approach than just modelling air demand, one which can be extended to other consumer demand issues.
The Kenza approach was initially developed for the needs of the Marketing Directorate of Airbus Industrie and has been the primary annual demand forecasting tool of Aéroports de Paris since 2003.
34. What’s Kenza for?
It’s my daughter’s name, for… this kind of model is like a child: you have to take care of it all your life long!
Thank you