Two talks on Thursday, 12 May:
Date: Thursday, 12 May 2011
Venue:
ENGREF
19 avenue du Maine
75732 PARIS
Metro: Montparnasse, Falguière
Room 208
########################################
"Random forests / Forêts aléatoires"
Gérard Biau (Université Pierre et Marie Curie)
and
"Euro area GDP forecasting using large survey datasets: a Random Forest approach"
Olivier Biau (European Commission)
#####Abstracts
Title: Random forests / Forêts aléatoires
Abstract: Random forests are a scheme proposed by Leo Breiman in the 2000s for building a predictor ensemble from a set of decision trees that grow in randomly selected subspaces of the data. Despite growing interest and practical use, there has been little exploration of the statistical properties of random forests, and little is known about the mathematical forces driving the algorithm. In this talk, we will discuss an in-depth analysis of a random forests model suggested by Breiman in 2004, which is very close to the original algorithm. We show in particular that the procedure is consistent and adapts to sparsity, in the sense that its rate of convergence depends only on the number of strong features and not on how many noise variables are present.
Title: Euro area GDP forecasting using large survey datasets: a Random Forest approach
Abstract: This paper investigates the potential of applying the Random Forests technique to modelling and forecasting macro-economic aggregates using large datasets of survey variables, in the same vein as Biau, Biau and Rouvière (2007). A specific application to short-term GDP forecasting in the euro area is shown using the harmonised European Union Business and Consumer Survey dataset. The Random Forests technique is explored with two aims in mind: the first is to obtain (through a Monte Carlo exercise) a preliminary non-parametric forecast of GDP growth, and the second is to analyse a number of candidate explanatory variables to distinguish between those which significantly contribute to explaining and predicting the analysed phenomenon and those which mostly add random noise. The forecast performance of this survey-based model is assessed with an out-of-sample exercise (using vintage data): the results are compared both with the outputs from an auto-regressive model (taken as benchmark) and with the quarterly projections of the euro zone economic outlook (jointly released by three major European economic institutes: the German IFO, the French INSEE and the Italian ISAE), which are deemed to be among the most reliable forecasts. Evidence is found that a well-performing and parsimonious survey-based model can be specified to forecast GDP quarter-on-quarter growth in the euro area, and that Random Forests is therefore an effective tool for selecting the most relevant predictive variables.