Random forests with random projections of the output space for high dimensional multi-label classification
Arnaud Joly,...

We adapt the idea of random projections applied to the output space, so as to enhance tree-based ensemble methods in the context of multi-label classification. We show how learning time complexity can be reduced without affecting computational complexity and accuracy of predictions. We also show that random output space projections may be used in order to reach different bias-variance tradeoffs, over a broad panel of benchmark problems, and that this may lead to improved accuracy while significantly reducing the computational burden of the learning stage.
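
As a rough illustration of what projecting the output space means, the sketch below maps a small binary label matrix to a lower-dimensional space with a Gaussian random matrix. The Gaussian choice, the 1/sqrt(m) scaling, the function names and the toy data are all assumptions made for illustration; they are not taken from the paper or its source code.

```python
import random


def gaussian_projection_matrix(m, d, seed=0):
    """Return an m x d matrix with i.i.d. N(0, 1/m) entries (assumed projection type)."""
    rng = random.Random(seed)
    scale = 1.0 / m ** 0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(m)]


def project_labels(Y, P):
    """Project each d-dimensional label vector in Y to m dimensions (z = P y)."""
    return [[sum(p * y for p, y in zip(row_p, y_vec)) for row_p in P] for y_vec in Y]


def total_variance(Z):
    """Sum over output dimensions of the per-dimension variance."""
    n = len(Z)
    dims = len(Z[0])
    means = [sum(z[j] for z in Z) / n for j in range(dims)]
    return sum((z[j] - means[j]) ** 2 for z in Z for j in range(dims)) / n


# Hypothetical toy data: 4 samples, 6 binary labels.
Y = [
    [1, 0, 0, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0, 1],
]

P = gaussian_projection_matrix(m=3, d=6, seed=0)
Z = project_labels(Y, P)  # 4 samples, 3 projected outputs

print("original total variance: ", total_variance(Y))
print("projected total variance:", total_variance(Z))
```

With this scaling, the projected total variance equals the original one in expectation over the random matrix, so a tree that scores splits by variance can work on the 3 projected outputs instead of the 6 original labels. For a single small matrix such as this one, the two values can differ noticeably.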

Link to the paper http://orbi.ulg.ac.be/handle/2268/172146
Source code available at https://github.com/arjoly/random-output-trees

Published in: Data & Analytics

  1. Random forests with random projections of the output space for high dimensional multi-label classification. Arnaud Joly, Pierre Geurts, Louis Wehenkel
  2. Multi-label classification tasks. Many supervised learning applications in text, biology or image processing involve samples that are associated with sets of labels. Input X: 800 × 600 pixels. Output Y: labels such as driver, mountain, road, car, tree, rock, line, human, ... If each label corresponds to a Wikipedia article, then we have around 4 million labels.
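  3. Random forest. Randomized trees are built on a bootstrap copy of the input-output pairs $((x_i, y_i) \in (\mathcal{X} \times \mathcal{Y}))_{i=1}^{n}$ by recursively maximizing the reduction of impurity, here the variance $\mathrm{Var}$. At each node, the best split is selected among $k$ randomly selected features. A node $S$ is split into $S_L$ ($X_k \le t_k$) and $S_R$ ($X_k > t_k$). Example: $\mathrm{Var}(S) = 0.24$, $\mathrm{Var}(S_L) = 0.014$, $\mathrm{Var}(S_R) = 0.1875$, $\Delta\mathrm{Var}(S) = \mathrm{Var}(S) - \dots$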
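
The split selection described in the random forest slide can be sketched as follows. This is a simplified, single-output illustration, not the authors' implementation: the size-weighted form of the variance reduction is the usual CART convention and is assumed here because the slide text is cut off, and the function names and toy data are hypothetical.

```python
import random


def variance(values):
    """Population variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def variance_reduction(y, y_left, y_right):
    """Parent variance minus the size-weighted variances of the two children (assumed form)."""
    n = len(y)
    return (variance(y)
            - len(y_left) / n * variance(y_left)
            - len(y_right) / n * variance(y_right))


def best_split(X, y, k, rng):
    """Search k randomly chosen features; return (feature index, threshold, reduction)."""
    best = (None, None, 0.0)
    for j in rng.sample(range(len(X[0])), k):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            gain = variance_reduction(y, left, right)
            if gain > best[2]:
                best = (j, t, gain)
    return best


# Hypothetical toy data: 5 samples, 2 features, one binary output.
X = [[1, 7], [2, 3], [3, 9], [4, 1], [5, 5]]
y = [0, 0, 0, 1, 1]

feature, threshold, gain = best_split(X, y, k=2, rng=random.Random(0))
print(feature, threshold, round(gain, 4))
```

Here k equals the number of features, so both are examined and the result does not depend on the random draw. With k smaller than the number of features, each node would see only a random subset, which is the source of randomization between trees.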
