1. The document proposes using random forest (RF), a machine learning technique, to forecast euro area GDP based on large survey datasets.
2. An out-of-sample exercise over 2004-2009 finds that a pure RF model outperforms an AR benchmark but not the official Euro Zone Economic Outlook forecasts. During the financial crisis, however, the pure RF performs poorly because the training data contained no negative growth values, so its averaged tree predictions could not turn negative.
3. A modified two-step approach uses RF to select the most important variables and then chooses the best linear model among them; this improves crisis-period performance and compares well with the official forecasts overall.
Presentation by Olivier Biau: Random forests and short-term economic analysis
Euro area GDP forecasting using large survey datasets: A random forest approach
Olivier Biau – Angela D'Elia
Directorate General for Economic and Financial Affairs, European Commission
Groupe Travail prévision, Paris, 12 May 2010
Views expressed represent exclusively the positions of the authors and do not necessarily correspond to those of the European Commission.
Editor's Notes
In recent years there has been increasing interest in forecasting methods that utilise large data sets. Indeed, there is a huge quantity of information available in the economic arena which might be useful for forecasting, but standard econometric techniques are not well suited to extracting it in a useful form. This is not only an issue of academic interest: central bankers and policy makers are interested in summarising large data sets for forecasting purposes (Eklund and Kapetanios give a wide review of the recent literature on this issue). Since the reference paper of Stock and Watson (2002), factor methods have been at the forefront of developments in forecasting with large data sets. The methodology for extracting a small number of common factors has become more and more sophisticated, from principal components to dynamic factor models (for example in Forni et al.), including ways to deal with unbalanced data sets at the end of the sample (the jagged-edge feature of the data), as in the recent contribution by Giannone on real-time nowcasting. Finally, factor analysis combined with linear (bridge) models has been among the main tools for summarising large data sets for forecasting purposes.
In our study, we would like to present a new statistical approach to forecasting macroeconomic aggregates based on the random forest (RF) technique, proposed by Breiman in the early 2000s. This technique is widely used in biostatistics, is becoming more and more popular, and appears to be very powerful in many different applications (classification problems and also regression problems). But it is largely unknown in economics. To our knowledge, the only application in the economic field is the paper I wrote with my co-authors (Gérard Biau and Laurent Rouvière) when I was working for the French INSEE. In that paper, we use the micro data from the French Industry Business Tendency Survey to track manufacturing output.
If RF is so widely used in medical research, it is because it enjoys good prediction properties, is robust to noise and can handle a very large number of input variables (which is very often the case in medicine, where one can collect a lot of variables on each patient while the number of observations is often limited). RF is considered to be one of the most accurate general-purpose learning techniques available, independent of any functional or distributional assumptions. RF is considered very powerful… but it is not clearly elucidated from a mathematical point of view. Although the mechanism appears simple, it involves many different driving forces which make it difficult to analyse. In fact, its mathematical properties remain to date largely unknown and, up to now, most theoretical studies have concentrated on isolated parts or stylized versions of the algorithm (Lin and Jeon, for instance, establish a connection with adaptive nearest neighbours). The most recent paper by Gérard Biau is a step forward, as it proves a consistency theorem for the RF algorithm.
After this introduction, here is the outline of the presentation: I will first present the dataset. In order to explain the algorithm, I will take a basic example. Then I will present the benchmarks, that is, the competitors of the random forest outputs. Finally, we will see our results and the perspectives of this work.
Well, the dataset used in this paper is based on the Joint Harmonised EU Programme of Business and Consumer Surveys (BCS). It covers five sectors (industry, services, retail trade, construction and consumers). A key aspect of the business surveys is that most questions ask for qualitative responses, reflecting the sentiment or confidence (optimistic, pessimistic or neutral) of managers and consumers. The Programme covers all 27 Member States, plus Croatia, the FYROM and Turkey. More than 125,000 firms and over 40,000 consumers are surveyed every month.
To be more precise, the dataset mainly consists of the euro area balances of opinion (% positive − % negative). The time series used in the analysis are those available at the end of the third month of each quarter: the level series (monthly S_t or quarterly S_q) and the difference series (S_t − S_{t−1}, S_t − S_{t−2} and S_t − S_{t−3} for monthly questions; S_q − S_{q−1} for quarterly questions). The dataset is composed of p = 172 'soft' series X_i. The only 'hard' variable is the euro area GDP quarter-on-quarter growth series Y_i. Finally, we have what is called in this literature a 'learning set' L = {(X_1, Y_1), …, (X_n, Y_n)}, with n = 57 (from 1995Q3 to 2009Q3).
I will now speak about the RF. But before dealing with the forest, I have to explain what a tree is and, above all, how it is grown. What is a tree? It is a partition of the space into regions. The most important case is the binary tree, which has just two children per node. Here we have an example: we first split the space into two regions; one or both of these regions are split into two more regions, and the process is continued until some stopping rule is applied. How is a tree grown (the answer here is with CART), and what is the tree predictor (or tree regressor)? In my next slide, I will take a basic example…
Imagine I want to predict the income of someone who enters this room using the information I have collected about you: I know a lot of your characteristics X_j, such as your gender, size, age and position (I mean, whether you have responsibilities: are you a head of unit, etc.), and of course I have collected your income Y. We have to find the first node of the tree, that is, the first splitting variable X_j and the first split point s which discriminate the most (by minimising the expression below when, for example, we model the response by a constant in each region). If we take the variable 'size', N1 represents the small ones (smaller than I am) and N2 the tall ones; C1 is the average income of the small ones and C2 the average income of the tall ones. Do you think this would be the best partition in terms of minimum sum of squares? Undoubtedly the answer would be no: the first node will be the position (the chiefs versus the others…) or the age. So, having found the first split by scanning all the covariates, CART repeats the process with the other variables until, for example, the terminal nodes contain a user-specified number of individuals.
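The expression on the slide is not in the transcript; for a constant fit in each region, the standard CART least-squares split criterion (as in Hastie et al.) reads:

$$
\min_{j,\,s}\;\Big[\,\min_{c_1}\sum_{i:\,X_i\in R_1(j,s)}(Y_i-c_1)^2\;+\;\min_{c_2}\sum_{i:\,X_i\in R_2(j,s)}(Y_i-c_2)^2\,\Big],
$$

where $R_1(j,s)=\{X : X^{(j)}\le s\}$ and $R_2(j,s)=\{X : X^{(j)}>s\}$, and the inner minima are attained at the region averages $\hat c_m=\mathrm{ave}(Y_i \mid X_i\in R_m(j,s))$.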
Now the tree is grown. Given the characteristics X of the newcomer (gender, size, position, …), I drop him down the tree; he falls into a terminal node N(X), and I predict his income by averaging the observed Y_i over the observations i 'falling' in that node. For example, if the person entering the room is a young man, 150 cm tall, French and not a head of unit, I will predict his income by averaging those of the young, small, French men who are not chiefs.
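In symbols (standard notation, not shown in the transcript), the tree regressor is simply the terminal-node average:

$$
\hat y(X) \;=\; \frac{1}{|N(X)|}\sum_{i:\,X_i\in N(X)} Y_i .
$$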
Breiman demonstrated that consequential gains in prediction accuracy can be achieved by using a set of simpler trees. Here we have the RF algorithm as described in the book by Hastie et al. More precisely, a random forest is a collection of K tree predictors, where each tree is constructed from a bootstrap sample of the learning set L. However, instead of determining the optimal split on a given node by evaluating all possible covariates, a subset of covariates drawn at random is used. Finally, we aggregate the K predictors built from these simpler trees. By doing so, it is possible to approximate a rich class of functions, and it gives an accurate approximation of the conditional mean. For the free parameters K, nodesize and mtry (the number of variables chosen at random from the p covariates), we used the default values 500, 5 and p/3 of the randomForest R package.
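A minimal sketch of this setup with the randomForest R package, matching the defaults quoted above; `survey_data` (GDP growth in the first column, the 172 balance series in the rest) and `latest_balances` are hypothetical objects standing in for the BCS dataset:

```r
library(randomForest)

# Learning set: n = 57 quarterly observations of euro area GDP growth
# and p = 172 survey balance series (hypothetical data.frame).
set.seed(1)
p   <- ncol(survey_data) - 1
fit <- randomForest(
  x = survey_data[, -1],        # the 172 balance-of-opinion series
  y = survey_data$gdp_growth,   # euro area GDP q-o-q growth
  ntree      = 500,             # K, the number of trees (package default)
  mtry       = floor(p / 3),    # covariates drawn at random at each split
  nodesize   = 5,               # minimum terminal-node size (regression default)
  importance = TRUE             # keep permutation importances for later
)

# Nowcast of the current quarter from the latest survey balances:
predict(fit, newdata = latest_balances)
```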
In addition, RF possesses a number of features that can be used, for example, to deal with the problem of missing values. In this study, we use the importance feature to reduce data dimensionality: here n, the number of observations, is much lower than p, the dimension of the explanatory variables. The variable importance allows us to discriminate between informative and non-informative variables. For each variable, the idea is to compare the prediction error with the prediction error when that variable is randomly permuted: large positive values indicate that the variable is predictive, since noising it up increases the prediction error; zero or negative importance values indicate non-predictive variables. In my previous example, the importance of size would very likely have been negative, while age or position would have been positive…
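Continuing the sketch above, the permutation importances come directly out of the fitted object (assuming the hypothetical `fit` from before):

```r
# type = 1 returns the permutation importance (%IncMSE): the mean
# increase in out-of-bag MSE when each variable is randomly permuted.
imp <- importance(fit, type = 1)

# Positive values flag predictive survey series; values near zero or
# below flag noise. Rank them and inspect the top of the list:
head(sort(imp[, 1], decreasing = TRUE), 10)
```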
So, our aim is to predict GDP growth for quarter Q based on the data available at the end of quarter Q. To be clear: based on the information available at the end of June 2010, we want to forecast the second quarter of 2010 (a nowcast; remember that the flash estimate of GDP will be released by Eurostat in mid-August). To assess the quality of our forecasts, we perform an out-of-sample analysis over 2004Q1–2009Q3; the criterion is the out-of-sample MSE. We started our study with a univariate AR model, which turned out to be a poor competitor, so we decided to compare our results with a fair one: the quarterly projections of the Euro Zone Economic Outlook (EZEO, jointly released by three major European economic institutes: the German IFO, the French INSEE and the Italian ISAE). EZEO is a quarterly publication: based on the information available at the end of the previous quarter, it forecasts GDP growth for quarter Q. In fact, this publication also provides two-step-ahead projections (Q+1 and Q+2) for GDP, industrial production, consumption and inflation, and describes the economic links explaining these forecasts. Here, however, our concern was simply to assess how a data-driven model like the RF performs relative to competitors for GDP nowcasting.
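A rough sketch of this out-of-sample design (an expanding window, re-estimating the forest each quarter; the quarter labels and objects are illustrative, reusing the hypothetical `survey_data` from above):

```r
# Expanding-window out-of-sample exercise over 2004Q1-2009Q3.
n0 <- which(rownames(survey_data) == "2003Q4")  # last purely in-sample quarter
n  <- nrow(survey_data)

sq_errors <- sapply((n0 + 1):n, function(t) {
  fit_t <- randomForest(x = survey_data[1:(t - 1), -1],
                        y = survey_data$gdp_growth[1:(t - 1)],
                        ntree = 500, nodesize = 5)
  yhat <- predict(fit_t, newdata = survey_data[t, -1])
  (survey_data$gdp_growth[t] - yhat)^2
})
mean(sq_errors)  # out-of-sample MSE, comparable across RF, AR and EZEO
```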
Results: we compare the pure (non-parametric) random forest to our benchmarks. Based on the MSE computed over the whole out-of-sample period, RF outperforms the AR model but not the Euro Zone Economic Outlook. The paper reports all the forecasts quarter after quarter, but in this graph I plotted how the MSE develops over time (the real-time MSE). The graph highlights the good performance of RF before the crisis and its poor performance during the crisis: for 2008Q4 and 2009Q1 we made huge errors. This is not surprising, as the pure RF forecasts are averages of values observed in the learning set, and before the crisis no negative values were present in the learning set! It is worth noting, however, that the RF algorithm "learns" and will be able to predict negative output in the future.
To get around the problem of the absence of negative values in our learning set, we use the power of RF to select the 25 most important variables and then insert these variables into a bridge model, using the general-to-specific procedure to choose the best linear model. RF_LINMOD is the model we retained. Remarkably, three of its five explanatory variables come from industry (whose variability explains most of the GDP cycle); moreover, these variables are related to order books, which are well known to be among the most factual and reliable soft variables. We also have one question on orders in retail trade and one question from the consumer survey.
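A hedged sketch of this two-step idea, again reusing the hypothetical `fit` and `survey_data`; note that `step()` performs AIC-based backward elimination, which is only a crude stand-in for the general-to-specific testing procedure used in the paper:

```r
# Step 1: pick the 25 most important survey series from the forest.
imp   <- importance(fit, type = 1)
top25 <- names(sort(imp[, 1], decreasing = TRUE))[1:25]

# Step 2: start from a linear bridge model on these 25 series and
# prune it by backward elimination.
full      <- lm(reformulate(top25, response = "gdp_growth"), data = survey_data)
rf_linmod <- step(full, direction = "backward", trace = 0)
summary(rf_linmod)
```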
Results: we evaluate the RF_LINMOD outputs using the same criterion. Based on the MSE computed over the whole out-of-sample period, it performs as well as the Euro Zone Economic Outlook; it outperforms the pure RF and compares well to the Euro Zone Economic Outlook during the crisis.
As a further development, we would like to continue this analysis by adding to our dataset the hard variables that are available at the end of each quarter (for example, the carry-over of industrial production or first registrations of private cars); according to our tests, the carry-over of industrial production appears to be one of the most important variables! Moreover, this kind of information is surely used by our colleagues for the Euro Zone Economic Outlook. So this work is still ongoing, but promising.