6 Tips for Optimizing TreeNet Gradient Boosting Models

  1. Dan Steinberg, January 2013, Salford Systems, www.salford-systems.com
  2. While TreeNet (Stochastic Gradient Boosting) can work phenomenally well out of the box, it almost always pays to tune your control parameters. Devoting time to optimizing a TreeNet model can noticeably improve its out-of-sample performance. Here is a list of several things recommended for all TreeNet users. © Copyright Salford Systems 2013
  3. TreeNet starts with 200 trees by default, although you can reset the default. In real-world modeling we often find that 1,000 or more trees perform better.
  4. This one goes hand in hand with growing enough trees, because the slower your learn rate is, the more trees you will need. There is nothing wrong with using a learn rate of 0.001 if you are willing to let your machine run through all the trees you will need.
  5. The default value of 0.10 means that 10% of the data could be ignored in each training cycle. You ought to experiment with a value of 0.0 to see whether it helps or hurts. You can also try values such as 0.02 or 0.05. Note: if the data are very clean, 0.0 should work best.
  6. If 500 trees are needed when you generate 6-node trees, you might need 1,500 or more when generating just 2-node trees. Sometimes moderately large trees work best: 12-node, 15-node, or even 25-node trees could do the trick. Since large trees learn more than smaller trees, you might also need to dial down the learn rate to prevent over-fitting.
  7. Try Battery LOVO (leave one variable out), as this might allow you to remove a variable from the middle of the pack in terms of importance. Try Battery SHAVING to remove the least important variables (shaving from the bottom of the list). This tests the viability of dropping the "best" variables.
  8. First, run some completely additive models. Unlike 2-node trees, which can actually allow interactions because of the manner in which TreeNet handles missing values, the ICL ADDITIVE command guarantees no possible interactions of any kind, including interactions between the missing value indicators created by TreeNet and other variables.
  9. Then, in the PRO EX version, you can run the BATTERY ADDITIVE procedure, which starts with a fully flexible model and searches for the one variable that can most readily be made additive (that is, interact with nothing). It then searches for a second variable to make additive, and so on, going step by step until all variables are additive. Reviewing the performance curve of this procedure lets you find the best balance between fully free interactivity and limited interactivity. If a variable really does not interact with any others, then preventing chance interactions from creeping into the model will improve the model on future unseen data.
  10. For more on TreeNet, visit http://www.salford-systems.com/en/products/treenet
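
The trade-off between learn rate and tree count described in the tips above can be illustrated with a small sketch. This is not TreeNet: it is a minimal least-squares booster built from 2-node trees (stumps) in plain Python, run on a made-up sine-wave dataset. The names `fit_stump` and `trees_needed`, the data, and the target error are all assumptions chosen for illustration. It counts how many trees are needed to reach the same training error at different learn rates.

```python
import math


def fit_stump(xs, residuals):
    """Fit a 2-node tree: return (threshold, left_value, right_value).

    Assumes xs is sorted ascending with distinct values.
    """
    n = len(xs)
    total = sum(residuals)
    best = None
    left_sum = 0.0
    for i in range(1, n):
        left_sum += residuals[i - 1]
        right_sum = total - left_sum
        # Maximizing this quantity is equivalent to minimizing squared error.
        gain = left_sum ** 2 / i + right_sum ** 2 / (n - i)
        if best is None or gain > best[0]:
            threshold = (xs[i - 1] + xs[i]) / 2
            best = (gain, threshold, left_sum / i, right_sum / (n - i))
    return best[1:]


def trees_needed(xs, ys, learn_rate, target_mse, max_trees=5000):
    """Boost stumps on residuals; return the tree count that first reaches
    target_mse on the training data, or None if max_trees is not enough."""
    n = len(ys)
    preds = [sum(ys) / n] * n
    for tree_count in range(1, max_trees + 1):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left, right = fit_stump(xs, residuals)
        preds = [
            p + learn_rate * (left if x <= threshold else right)
            for x, p in zip(xs, preds)
        ]
        mse = sum((y - p) ** 2 for y, p in zip(ys, preds)) / n
        if mse <= target_mse:
            return tree_count
    return None


# Made-up illustration data: one full period of a sine wave.
xs = [i / 100 for i in range(100)]
ys = [math.sin(2 * math.pi * x) for x in xs]

for rate in (1.0, 0.3, 0.1):
    print(rate, trees_needed(xs, ys, rate, target_mse=0.02))
```

The sketch measures training error only, so it shows the cost side of a slow learn rate (more trees) and not the benefit; judging whether a slower rate pays off requires out-of-sample data.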
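
The variable-selection tip can also be sketched generically. The code below is not the TreeNet BATTERY implementation; it is a hedged outline of the two ideas in plain Python. The `score` callback is hypothetical: it stands for "train a model on these variables and return its holdout error". The toy scorer and the variable names are invented for the demonstration, and the shaving function assumes a fixed importance ranking rather than re-ranking after every refit.

```python
def lovo(variables, score):
    """Leave one variable out: return the baseline error and, for each
    variable, the change in error when that variable alone is removed.
    A negative change means the model improved without the variable."""
    baseline = score(list(variables))
    deltas = {}
    for name in variables:
        reduced = [v for v in variables if v != name]
        deltas[name] = score(reduced) - baseline
    return baseline, deltas


def shave_from_bottom(ranked, score):
    """Drop the least important variable one at a time.

    `ranked` is ordered from most to least important (assumed fixed here).
    Returns the (variables, error) pair with the lowest error on the path.
    """
    kept = list(ranked)
    path = [(list(kept), score(kept))]
    while len(kept) > 1:
        kept = kept[:-1]
        path.append((list(kept), score(kept)))
    return min(path, key=lambda step: step[1])


# Invented stand-in for "train a model and measure holdout error":
# each variable lowers (positive) or raises (negative) accuracy.
TOY_USEFULNESS = {"age": 0.10, "income": 0.05, "zip": -0.005, "noise": -0.02}


def toy_score(variables):
    return 0.5 - sum(TOY_USEFULNESS[v] for v in variables)


ranked = ["age", "income", "zip", "noise"]
baseline, deltas = lovo(ranked, toy_score)
best_vars, best_err = shave_from_bottom(ranked, toy_score)
print(baseline, deltas)
print(best_vars, best_err)
```

In real use the scorer would refit the boosted model for every candidate subset, which is why these batteries are expensive and are best run after the tree count and learn rate have been settled.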