
PyData Hamburg - Gradient Boosting


Gradient Boosted Decision Trees are a prize-winning machine learning model that can be used for classification, regression and ranking tasks. This talk gives a theoretical introduction to decision trees and gradient boosting and shows examples from practical applications.


  1. A Deep Dive Into Gradient Boosting (Daniel Kohlsdorf)
  2. Why Gradient Boosting?
  3. 1. Outlier Filtering For Job Recommendations 2. Reranking Job Recommendations 3. Classifying Profiles: 1. Willingness To Change Job 2. Discipline Since Then …
  4. Decision Trees, Gradient Boosting and Application(s)
  5. Decision Trees
  6. Decision Trees
  7. Goal: 1. Partition the input space 2. Pure class distribution in each partition
  8. Decision Trees: Guillotine cuts
  9. Decision Trees: Guillotine cuts
  10. Decision Trees: Guillotine cuts
  11. Finding The Best Split [chart: ages of Spongebob fans vs. crime show fans, with a candidate split on age]
  12. Finding The Best Split [chart: class distributions of the two partitions for a candidate split on age]
  13. Finding The Best Split [chart: class distributions of the two partitions for another candidate split on age]
  14. Finding the best split: • Choose the best split based on class-distribution impurity • A common impurity measure is entropy • Choose the split that reduces impurity the most (sketched in code after the slide list) [chart: class distributions of the candidate partitions]
  15. Greedily Constructing A Decision Tree
  16. Greedily Constructing A Decision Tree
  17. Greedily Constructing A Decision Tree
  18. Greedily Constructing A Decision Tree (the greedy construction is sketched after the slide list)
  19. Gradient Boosting: One Tree Is Not Enough
  20. Ensemble Methods: 1. Weighted combination of weak learners 2. Prediction is based on committee votes 3. Boosting: 1. Train the ensemble one weak learner at a time 2. Focus new learners on wrongly predicted examples
  21. Gradient Boosting: 1. Learn a regressor 2. Compute the error residual (the gradient, in deep-learning terms) 3. Build a new model to predict that residual (the full loop is sketched after the slide list)
  22. Gradient Boosting. Our model: for each datapoint return 0
  23. Gradient Boosting
  24. Gradient Boosting
  25. Building a Decision Tree from Gradients: we have a gradient for each training example; a leaf returns the pooled gradients instead of a class [chart: Spongebob fans vs. crime show fans by age]
  26. Building a Decision Tree from Gradients: impurity is the magnitude of the pooled gradient (see the gradient-split sketch after the slide list) [chart: Spongebob fans vs. crime show fans by age]
  27. Not just regression: 1. RMSE [Prediction] 2. Sigmoid [Binary Classification] 3. Softmax [Multiclass Classification] 4. Ranking Loss [Ranking] (the loss gradients are sketched after the slide list)
  28. Gradient Boosting @Xing
  29. Outlier Filtering
  30. [no transcript text]
  31. [no transcript text]
  32. Thanks! daniel.kohlsdorf@xing.com http://daniel-at-world.blogspot.com
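
Some code sketches to accompany the slides. First, slides 11–14: candidate splits are scored by the entropy of the class distributions they produce. A minimal NumPy sketch of that idea (my own illustration, not the speaker's code; the age/fan arrays are made-up toy data in the spirit of the Spongebob vs. crime show example):

    import numpy as np

    def entropy(labels):
        """Impurity of a class distribution: -sum(p * log2(p))."""
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    def best_split(feature, labels):
        """Try every observed value as a threshold and keep the one that
        gives the lowest weighted entropy over the two partitions."""
        best_t, best_impurity = None, np.inf
        for t in np.unique(feature):
            left, right = labels[feature <= t], labels[feature > t]
            if len(left) == 0 or len(right) == 0:
                continue
            impurity = (len(left) * entropy(left)
                        + len(right) * entropy(right)) / len(labels)
            if impurity < best_impurity:
                best_t, best_impurity = t, impurity
        return best_t, best_impurity

    # Toy data: age vs. Spongebob fans (0) / crime show fans (1).
    age = np.array([8, 10, 12, 14, 35, 40, 45, 50])
    fan = np.array([0, 0, 0, 0, 1, 1, 1, 1])
    print(best_split(age, fan))  # a threshold in the teens separates the classes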
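Slides 15–18: the tree is grown greedily, splitting until a node is pure or a depth limit is reached. A sketch under the same assumptions, reusing best_split from the snippet above and representing the tree as nested dicts:

    def majority(labels):
        """Leaf prediction: the most common class in the node."""
        values, counts = np.unique(labels, return_counts=True)
        return values[np.argmax(counts)]

    def build_tree(X, y, depth=0, max_depth=3):
        """Greedy top-down construction over all features of a 2-D array X."""
        if depth == max_depth or len(np.unique(y)) == 1:
            return {"leaf": majority(y)}
        candidates = [(j, *best_split(X[:, j], y)) for j in range(X.shape[1])]
        candidates = [(j, t, imp) for j, t, imp in candidates if t is not None]
        if not candidates:
            return {"leaf": majority(y)}
        j, t, _ = min(candidates, key=lambda c: c[2])   # lowest impurity wins
        mask = X[:, j] <= t
        return {"feature": j, "threshold": t,
                "left": build_tree(X[mask], y[mask], depth + 1, max_depth),
                "right": build_tree(X[~mask], y[~mask], depth + 1, max_depth)}

    # usage with the toy data above:
    tree = build_tree(age.reshape(-1, 1), fan)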
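Slides 20–24: the ensemble starts by predicting 0 for every datapoint, each new weak learner is fit to the current residuals, and the final prediction is the committee sum. A sketch for squared-error loss; using scikit-learn's DecisionTreeRegressor as the weak learner is my shortcut, not necessarily what the talk used:

    from sklearn.tree import DecisionTreeRegressor

    def fit_gbm(X, y, n_trees=100, learning_rate=0.1, max_depth=3):
        """Squared-error gradient boosting: fit each new tree to the residuals."""
        prediction = np.zeros(len(y))
        trees = []
        for _ in range(n_trees):
            residual = y - prediction            # negative gradient of 0.5 * (y - f)^2
            tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residual)
            prediction += learning_rate * tree.predict(X)
            trees.append(tree)
        return trees

    def predict_gbm(trees, X, learning_rate=0.1):
        """Committee prediction: weighted sum of all weak learners."""
        return learning_rate * sum(tree.predict(X) for tree in trees)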
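Slides 25–26: two changes turn the classification tree into a gradient tree: a leaf returns the pooled gradient instead of a class, and impurity becomes the magnitude of the pooled gradient. One common way to write that down (a simplified gain with a unit Hessian per example; this formulation is my assumption, not taken from the slides):

    def node_score(gradients):
        """Magnitude of the pooled gradient, normalised by the node size."""
        return gradients.sum() ** 2 / len(gradients)

    def best_gradient_split(feature, gradients):
        """Keep the threshold whose children gain the most score over the parent."""
        best_t, best_gain = None, 0.0
        parent = node_score(gradients)
        for t in np.unique(feature):
            left, right = gradients[feature <= t], gradients[feature > t]
            if len(left) == 0 or len(right) == 0:
                continue
            gain = node_score(left) + node_score(right) - parent
            if gain > best_gain:
                best_t, best_gain = t, gain
        return best_t, best_gain

    def leaf_value(gradients):
        """A leaf returns the pooled (mean) gradient instead of a class."""
        return gradients.mean()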
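Slide 27: the same boosting loop handles other tasks because the trees are always fit to the negative gradient of whichever loss is chosen. Sketches of those gradients for squared error and binary logistic loss (softmax and ranking losses plug in the same way):

    def grad_squared_error(y, raw_prediction):
        """Negative gradient of 0.5 * (y - f)^2: the plain residual."""
        return y - raw_prediction

    def grad_logistic(y, raw_prediction):
        """Negative gradient of the log loss for labels y in {0, 1};
        the raw prediction is a logit squashed by a sigmoid."""
        return y - 1.0 / (1.0 + np.exp(-raw_prediction))

In practice, libraries such as XGBoost and LightGBM expose these choices as configurable objectives, so you pick a loss rather than derive its gradient by hand.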
