1. Probabilistic demand forecasting Prepared & presented by Daniel SALLIER Traffic Data & Forecasting Director Aéroports de Paris [email_address] 01 70 03 45 68
2. Content <ul><li>Foreground </li></ul><ul><ul><li>The "classical" forecasting approach </li></ul></ul><ul><ul><li>Drawbacks of the "classical" forecasting approach </li></ul></ul><ul><ul><li>2 generic sources of uncertainty in any forecast </li></ul></ul><ul><li>How to cope with the intrinsic technical uncertainty </li></ul><ul><ul><li>What is the output we are looking for … </li></ul></ul><ul><ul><li>Let's go back to the very basics </li></ul></ul><ul><ul><li>Step #1: model determination </li></ul></ul><ul><ul><li>Step #2: determination of the law of probability of the model parameters </li></ul></ul><ul><ul><li>Step #3: determination of the law of probability of the model output: Y </li></ul></ul><ul><ul><li>Step #4: determination of the law of probability of the future values </li></ul></ul>
3. Content (continued) <ul><ul><li>The data aggregation / break-up issue </li></ul></ul><ul><ul><li>The data aggregation issue </li></ul></ul><ul><ul><li>The data break-up issue </li></ul></ul><ul><li>Part of the prospective uncertainty: the residual issue </li></ul><ul><ul><li>What are residuals? </li></ul></ul><ul><ul><li>Taking into account part of the prospective risk </li></ul></ul><ul><li>Further developments and applications </li></ul><ul><ul><li>Vertical cuts for most of the short term utilisation </li></ul></ul><ul><ul><li>Horizontal cuts for most of the mid & long term utilisation </li></ul></ul><ul><li>Conclusions </li></ul><ul><ul><li>So many advantages, so few drawbacks </li></ul></ul>
4. Foreground
5. The "classical" forecasting approach <ul><li>Econometric or chronological (time-series) models most of the time; </li></ul><ul><li>Assumptions on the future value of the inputs leading to: </li></ul><ul><ul><li>A single forecasted value (base case?); </li></ul></ul><ul><ul><li>A scenario-based forecast. </li></ul></ul><ul><li>"Post-processing" of the model outputs by the experts and/or the management. </li></ul>[Figure: historical traffic in passengers (M) by year, 1950 to 2020, extended by base case, high case and low case forecasts]
6. Drawbacks of the "classical" forecasting approach <ul><li>The "cheating/forgery" risk: </li></ul><ul><ul><li>"political" figures decided by the management, to be "scientifically" justified by the forecasting team; </li></ul></ul><ul><ul><li>experts eager to be as consensual as possible with the rest of the community: better to be wrong together than right alone! </li></ul></ul><ul><li>It ends up with self-deception in the company; </li></ul><ul><li>The never-ending "what if …" questions asked by a management afraid of having to make a decision; </li></ul><ul><li>The forecasting team implicitly deciding what level of risk the company should incur; </li></ul><ul><li>A single figure, or even a set of scenario-related figures, does not make any sense from a mathematical and statistical point of view. </li></ul>
7. 2 generic sources of uncertainty in any forecast <ul><li>The intrinsic technical uncertainty: </li></ul><ul><ul><li>Assumptions on the future value of the inputs (GDP, population, fares, …); </li></ul></ul><ul><ul><li>The very nature of the forecasting model (linear law, exponential law, log law, …); </li></ul></ul><ul><ul><li>The uncertainty on the value of the parameters of the forecasting models; </li></ul></ul><ul><ul><li>The residuals: the difference between actual values and estimates. </li></ul></ul><ul><li>The prospective uncertainty: any "abnormal" event which may happen in the future. </li></ul>The techniques developed by ADP's R&D team mostly address the 1st type of generic uncertainty: the intrinsic technical uncertainty.
8. How to cope with the intrinsic technical uncertainty
9. What is the output we are looking for … <ul><li>The theory of probabilities provides the tools to answer most of the issues raised by the measurement of the present and the future uncertainty: </li></ul>… how to proceed? Dummy data
10. Let's go back to the very basics <ul><li>The full story always starts with a cloud of dots out of which one should find one or several laws/models to be further used as forecasting model(s): </li></ul>Actual data
11. Step #1: model determination <ul><li>One or several models can fit the data. The way the models are determined is not important (econometric models, behavioural models, etc.). </li></ul>Unless one has a precise reason to select a specific model, there is no reason to keep just one of them and discard all the others. Each model is given an equal chance. R&D work is in progress to address this issue: the ADN (Alexander’s Drift Net) engine. [Figure: actual data overlaid successively with the 1st, 2nd and 3rd fitted models]
12. Step #2: determination of the law of probability of the model parameters <ul><li>Let's take the 1st model for instance. </li></ul><ul><li>Its equation is: [equation not recoverable from the slide transcript] </li></ul><ul><li>where X is the residual </li></ul><ul><li>Bootstrap techniques make it possible to determine the laws of probability of the different parameters of the model, which are strongly correlated with one another. </li></ul>[Figure: two histograms (x-axis: parameter value; y-axis: probability, 0% to 9%), one spanning roughly 17.5 to 20.5 and the other roughly 0.08 to 0.18. Caption: example of drawings of random samples of the model parameters]
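The bootstrap step can be illustrated with a minimal sketch. Since the equation of the 1st model is not reproduced in this transcript, the sketch assumes a hypothetical straight-line model `y = a + b * x` fitted by least squares, and uses a "pairs" bootstrap (resampling the observed points with replacement), which is only one of several bootstrap variants. The data, the function names `fit_line` and `bootstrap_parameters`, and the number of draws are all made up for illustration. Each draw keeps `(a, b)` together, so the correlation between parameters is preserved.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of the assumed model y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def bootstrap_parameters(xs, ys, n_draws=1000, seed=0):
    """Pairs bootstrap: refit the model on resampled points, keep each (a, b) jointly."""
    rng = random.Random(seed)
    n = len(xs)
    draws = []
    while len(draws) < n_draws:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # degenerate resample: a line cannot be fitted
        draws.append(fit_line(bx, by))
    return draws


# Made-up observations, for illustration only.
xs = list(range(12))
noise = [0.05, -0.08, 0.02, 0.09, -0.04, -0.07, 0.06, 0.01, -0.03, 0.08, -0.06, 0.03]
ys = [18.0 + 0.13 * x + e for x, e in zip(xs, noise)]

param_draws = bootstrap_parameters(xs, ys)
mean_b = sum(b for _, b in param_draws) / len(param_draws)
```

A histogram of the first or second component of `param_draws` gives an empirical law of probability for that parameter, in the spirit of the figures on this slide.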
13. Step #3: determination of the law of probability of the model output: Y <ul><li>At this stage we have all the probabilistic components of the forecasting model. That is where the Monte Carlo technique proves to be useful: </li></ul><ul><ul><li>Take a future deterministic or sampled value of X; </li></ul></ul><ul><ul><li>Draw a random sample of the model parameters; </li></ul></ul><ul><ul><li>Compute the corresponding value of Y; </li></ul></ul><ul><ul><li>Save the value of Y; </li></ul></ul><ul><ul><li>Start the process again until a sufficient number of Ys has been collected; </li></ul></ul><ul><ul><li>Compute the frequency/probability law of Y. </li></ul></ul>[Figure: forecasting model #2 plotted against actual data (X axis, Y axis), showing the curve with 50% probability for Y to be greater or equal and the band within which Y lies with 98% probability]
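The loop listed above can be sketched as follows. This is an illustration under stated assumptions, not the ADP implementation: the model is again a hypothetical `y = a + b * x`, `param_draws` is a made-up list of jointly drawn parameter pairs, and `sample_x` is a hypothetical callable returning either a fixed or a sampled future value of X.

```python
import random


def monte_carlo_output(param_draws, sample_x, n_runs=10000, seed=0):
    """Collect n_runs values of Y and return them sorted in ascending order."""
    rng = random.Random(seed)
    ys = []
    for _ in range(n_runs):
        x = sample_x(rng)               # future value of X (deterministic or sampled)
        a, b = rng.choice(param_draws)  # parameters are drawn jointly
        ys.append(a + b * x)            # assumed model: y = a + b * x
    ys.sort()
    return ys


def quantile(sorted_values, q):
    """Nearest-rank quantile of an ascending sample."""
    idx = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[idx]


# Made-up parameter draws, for illustration only.
param_draws = [(18.0, 0.10), (18.5, 0.12), (19.0, 0.13), (19.5, 0.15), (20.0, 0.16)]

ys = monte_carlo_output(param_draws, sample_x=lambda rng: 20.0)

# 98% band (1st to 99th percentile) and the value with a 50% probability
# for Y to be greater or equal (the median).
band_low, median, band_high = (quantile(ys, q) for q in (0.01, 0.50, 0.99))
```

Repeating the computation for each future value of X traces the 50% curve and the 98% band shown in the figure.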
14. Step #4: determination of the law of probability of the future values <ul><li>At this stage of the process we have all the probabilistic future values of each forecasting model. </li></ul><ul><li>That is where the Monte Carlo technique is used once again to combine all these values and get the final probabilistic forecast. </li></ul><ul><li>Each model is given an equal probability to occur. </li></ul>[Figure: final probabilistic forecast plotted against actual data (X axis, Y axis), showing the curve with 50% probability for Y to be greater or equal and the band within which Y lies with 98% probability]
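One way to combine the models with equal probability is to treat the final forecast as an equally weighted mixture: pick a model at random, then pick one of its simulated values. The sketch below assumes this reading; the three sets of samples are made-up numbers and `combine_models` is a hypothetical helper name.

```python
import random


def combine_models(samples_by_model, n_runs=10000, seed=0):
    """Equal-weight mixture of the per-model samples, returned in ascending order."""
    rng = random.Random(seed)
    models = list(samples_by_model)
    combined = []
    for _ in range(n_runs):
        model = rng.choice(models)  # each model has the same chance to occur
        combined.append(rng.choice(samples_by_model[model]))
    combined.sort()
    return combined


# Made-up per-model samples of Y for one future year, for illustration only.
data_rng = random.Random(1)
samples_by_model = {
    "model_1": [data_rng.gauss(14.0, 0.5) for _ in range(2000)],
    "model_2": [data_rng.gauss(15.0, 0.7) for _ in range(2000)],
    "model_3": [data_rng.gauss(16.5, 0.9) for _ in range(2000)],
}

combined = combine_models(samples_by_model)
median = combined[len(combined) // 2]
```

The combined sample is typically wider than any single model's sample, because it also carries the uncertainty about which model is right.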
15. The data aggregation / break-up issue
16. The data aggregation issue <ul><li>Let's suppose that we are interested in the forecasted demand of French residents, which depends on the French GDP. </li></ul><ul><li>For a given value of the French GDP, we can calculate a forecasted demand to/from the UK, to/from the USA, to/from Japan, etc. It means that, from a statistical point of view, the different flows of traffic from/to France cannot be regarded as independent variables. </li></ul><ul><li>A straightforward application of the Monte Carlo technique would mix all the random samples together along the computation process as if they were fully independent, which they are not. </li></ul>
17. The data aggregation issue (continued) <ul><li>This problem can be overcome by "flagging" each value of the explanatory variables (i.e. French GDP, British GDP, etc.) and "sticking" the flag value(s) to the intermediate or final random samples which share the same value of the explanatory variable(s). </li></ul><ul><li>Instead of "mixing around" the whole data set, the Monte Carlo engine just "mixes around" the random samples which share the same flag. </li></ul>
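The effect of flagging can be seen on a toy example. The sketch below assumes two hypothetical traffic flows that both depend linearly on the same GDP draw; the coefficients, the noise level and the helper names are invented for illustration. The flag is simply the index of the GDP draw. Aggregating within a flag keeps the common dependence on GDP, whereas naive mixing treats the flows as independent and understates the spread of the total.

```python
import random
import statistics


def sample_flow(intercept, slope, gdp_values, n_per_flag, rng):
    """Samples of one flow, grouped by flag (the index of the GDP draw used)."""
    return {
        flag: [intercept + slope * gdp + rng.gauss(0.0, 0.2) for _ in range(n_per_flag)]
        for flag, gdp in enumerate(gdp_values)
    }


def aggregate_flagged(flow_a, flow_b, n_runs, rng):
    """Mix only samples that share the same flag."""
    flags = list(flow_a)
    totals = []
    for _ in range(n_runs):
        flag = rng.choice(flags)
        totals.append(rng.choice(flow_a[flag]) + rng.choice(flow_b[flag]))
    return totals


def aggregate_naive(flow_a, flow_b, n_runs, rng):
    """Mix all samples as if the flows were independent (ignores the flags)."""
    flags = list(flow_a)
    totals = []
    for _ in range(n_runs):
        a = rng.choice(flow_a[rng.choice(flags)])
        b = rng.choice(flow_b[rng.choice(flags)])
        totals.append(a + b)
    return totals


rng = random.Random(0)
gdp_values = [rng.gauss(2.0, 1.0) for _ in range(200)]  # made-up GDP growth draws
flow_uk = sample_flow(5.0, 1.0, gdp_values, 50, rng)    # hypothetical flow models
flow_usa = sample_flow(3.0, 0.5, gdp_values, 50, rng)

spread_flagged = statistics.pstdev(aggregate_flagged(flow_uk, flow_usa, 20000, rng))
spread_naive = statistics.pstdev(aggregate_naive(flow_uk, flow_usa, 20000, rng))
```

With these made-up numbers `spread_flagged` comes out larger than `spread_naive`: ignoring the shared explanatory variable makes the aggregated forecast look more certain than it is.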
18. The data break-up issue <ul><li>Let's suppose, for instance, that the overall business level of risk has been set to an 80% probability for the overall demand to be greater or equal. How does it cascade down? What is the corresponding level of risk of each traffic flow? </li></ul><ul><li>One should bear in mind that, unfortunately, 1 + 1 ≠ 2 when dealing with probabilities; 1 + 1 could make 1.9! </li></ul><ul><li>Flagging the random samples of each traffic flow is one of the solutions to trace back which ones have been used in the final computation. </li></ul>[Figure: cumulated distribution of probabilities of the overall demand (0%, 80%, 100%), showing the set of samples to be discarded and the set of samples to be elected; frequency law of the elected samples and cumulated distribution of probabilities of the demand of traffic flow #i (0%, 74%, 100%)]
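The sketch below illustrates both points on made-up data. It is one possible reading of the slide, not the documented ADP procedure: two hypothetical, independent flows are simulated, the run index acts as the flag, the runs whose total lies close to the overall 80% level are "elected", and the typical value of each flow in those runs is located on that flow's own distribution. The flow parameters, the tolerance and the helper names are assumptions.

```python
import random


def exceedance_probability(values, level):
    """Share of sampled values greater than or equal to level."""
    return sum(v >= level for v in values) / len(values)


def level_exceeded_with(values, probability):
    """Sample value that is equalled or exceeded with roughly the given probability."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, int((1.0 - probability) * len(ordered)))
    return ordered[idx]


rng = random.Random(0)
n_runs = 20000

# Made-up flows (M pax); position in each list = run index = flag.
flows = {
    "flow_1": [rng.gauss(10.0, 1.0) for _ in range(n_runs)],
    "flow_2": [rng.gauss(6.0, 1.0) for _ in range(n_runs)],
}
totals = [sum(flows[name][run] for name in flows) for run in range(n_runs)]

# Overall demand level with an 80% probability of being equalled or exceeded.
overall_level = level_exceeded_with(totals, 0.80)

# "1 + 1 != 2": adding up each flow's own 80% level does not give the overall level.
naive_sum = sum(level_exceeded_with(values, 0.80) for values in flows.values())

# Elect the runs (flags) whose total lies close to the overall level.
tolerance = 0.05
elected = [run for run in range(n_runs) if abs(totals[run] - overall_level) <= tolerance]

# Corresponding level of risk of each flow.
flow_risk = {}
for name, values in flows.items():
    typical = sum(values[run] for run in elected) / len(elected)
    flow_risk[name] = exceedance_probability(values, typical)
```

With these assumptions each flow ends up with a probability below 80% (around 72%), which is the same kind of gap as the 80% versus 74% shown in the figure.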
19. Part of the prospective uncertainty: the residual issue
20. Taking into account part of the prospective risk <ul><li>A very simple and straightforward idea: </li></ul><ul><ul><li>Determination of the law of probability of the residuals. </li></ul></ul><ul><ul><li>Addition of the residual effects to the "regular" probabilistic forecast, which can be achieved with a new round of Monte Carlo simulations. </li></ul></ul><ul><li>By doing so we can take into account part of the prospective risks, i.e. the risks linked to "unusual" events which already happened in the past and may happen again. </li></ul><ul><li>Of course there are no statistical or probabilistic methods to estimate the effects of future events which have never happened before; that is where scenario-based approaches can be brought back to the front stage. </li></ul><ul><li>This approach answers the amplitude and likelihood questions about the "unusual" events. It does not answer the when and how long questions: it just measures a "latent risk". </li></ul>
21. Taking into account part of the prospective risk (continued) There is ground here for the development of specific financial / management / industrial tools and policies to cover part of this latent risk. [Figure: probability distribution of the residuals (x-axis: residuals as % of total pax, -25% to +15%; y-axis: probability, 0% to 25%)] [Figure: traffic/demand (M pax) by year, 1986 to 2024: actual traffic data, with the curve of 50% probability for the demand to be greater or equal and the 98% probability range, shown both without residuals and with residuals included]
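The extra round of Monte Carlo simulation can be sketched as follows, assuming that residuals are expressed as a percentage of traffic (as on the residuals chart) and that their law of probability is approximated by resampling the historical residuals. The residual values, the base forecast samples and the function name `add_residuals` are made up for illustration.

```python
import random


def add_residuals(forecast_samples, residual_pcts, n_runs=20000, seed=0):
    """Combine each drawn forecast value with a resampled historical residual (in %)."""
    rng = random.Random(seed)
    combined = []
    for _ in range(n_runs):
        base = rng.choice(forecast_samples)
        residual = rng.choice(residual_pcts)  # empirical law of the residuals
        combined.append(base * (1.0 + residual / 100.0))
    combined.sort()
    return combined


# Made-up historical residuals (% of total pax), including one "unusual" event.
residual_pcts = [-22.0, -9.0, -4.0, -2.0, -1.0, 0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

# Made-up "regular" probabilistic forecast for one future year (M pax).
data_rng = random.Random(1)
forecast_samples = sorted(data_rng.gauss(14.0, 0.5) for _ in range(2000))

with_residuals = add_residuals(forecast_samples, residual_pcts)

# Lower bound of the 98% range, without and with residuals.
low_without = forecast_samples[int(0.01 * len(forecast_samples))]
low_with = with_residuals[int(0.01 * len(with_residuals))]
```

As stated on the previous slide, this says nothing about when an unusual event occurs or how long it lasts; it only widens the range to reflect the latent risk.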
22. Further developments and applications
23. Vertical cuts for most of the short term utilisation <ul><li>To be used for: </li></ul><ul><li>(human) Resources dimensioning </li></ul><ul><li>Budget, cash flow </li></ul><ul><li>Future financial ratios analysis </li></ul><ul><li>Short term risk assessment </li></ul><ul><li>etc. </li></ul>[Figure: traffic/demand by year, 1986 to 2024, with actual capacity; for a given year, three panels show the probability to be greater or equal for the demand/traffic (million pax), the turnover (million €) and the operating profit (million €, with the €0 million mark), each with the capacity threshold]
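A vertical cut can be read as follows: fix a year, take the simulated values for that year, and compute the probability for the quantity to be greater than or equal to each level of interest. The sketch below assumes this reading; the sample values, the thresholds and the function name are illustrative.

```python
def exceedance_curve(samples, thresholds):
    """For each threshold, the share of samples greater than or equal to it."""
    n = len(samples)
    return [(t, sum(s >= t for s in samples) / n) for t in thresholds]


# Made-up simulated demand for one year (million pax) and a hypothetical capacity.
demand_samples = [10, 11, 12, 13, 14]
capacity_threshold = 12.0

curve = exceedance_curve(demand_samples, [capacity_threshold, 15.0])
probability_over_capacity = curve[0][1]
```

The same function applies to turnover or operating profit once each demand sample has been converted into the corresponding financial figure.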
24. Horizontal cuts for most of the mid & long term utilisation To be mostly used for optimal dimensioning and planning of mid and long term capacity growth (heavy investments, etc.). [Figure: traffic/demand by year, 1986 to 2024, with actual capacity and planned capacity; annual operating profit from year 0, showing the 50% probability curve and the 98% centred probability range for the actual capacity and for the planned capacity]
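One way to read a horizontal cut is to fix a capacity level and follow, year after year, the probability for the demand to reach it; the first year in which that probability passes an accepted level of risk is a candidate date for new capacity. This interpretation, the sample values and the helper names below are assumptions made for illustration.

```python
def probability_over_capacity(samples_by_year, capacity):
    """Per-year probability for the demand to be greater than or equal to capacity."""
    return {
        year: sum(s >= capacity for s in samples) / len(samples)
        for year, samples in sorted(samples_by_year.items())
    }


def first_year_at_risk(samples_by_year, capacity, risk_level):
    """First year in which that probability reaches risk_level, or None."""
    for year, p in probability_over_capacity(samples_by_year, capacity).items():
        if p >= risk_level:
            return year
    return None


# Made-up simulated demand per year (million pax), for illustration only.
samples_by_year = {
    2010: [10, 11, 12, 13],
    2011: [12, 13, 14, 15],
    2012: [14, 15, 16, 17],
}

probabilities = probability_over_capacity(samples_by_year, capacity=14)
saturation_year = first_year_at_risk(samples_by_year, capacity=14, risk_level=0.5)
```

Running the same computation with a planned capacity instead of the actual one shows how far an investment pushes that date back.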
25. Conclusions
26. So many advantages, so few drawbacks <ul><li>A quite simple idea, but a rather complex and computationally expensive approach; </li></ul><ul><li>It puts an end to the times when forecasters were regarded as fortune-tellers, gurus, devious crooks or scientific alibis for their boss's misbehaviour (or their boss's boss's); </li></ul><ul><li>It brings the risk-taking decision back to where it should always have been: the top management. In addition it offers the exhaustive set of data required by risk assessment tools; </li></ul><ul><li>It is likely to offer better legal protection to the forecasters in case of litigation with the shareholders or the financial markets; </li></ul><ul><li>Our own experience is that bankers are fond of this way of making forecasts. Aren't they mostly risk traders? </li></ul><ul><li>We (ADP's forecasting team) are fond of it too, since it saves us a lot of forecast post-processing time and no more pressure is put on us to find "convenient figures". </li></ul>