์ „์ฐฝ์šฑ
2018.04.24
Machine Learning
Ensemble Boosting
Table of Contents
1. WHY
2. Ensemble
3. Bagging
4. Boosting
5. Stacking
6. Adaptive Boosting
7. GBM
8. XGBoost
9. Light GBM
XGBoost
XGBoost๋Š” eXtream Gradient Boosting์˜ ์•ฝ์ž
Gradient Boosting Algorithm
์ตœ๊ทผ Kaggle User์—๊ฒŒ ํฐ ์ธ๊ธฐ
What is XGBoost?
Ensemble
๋™์ผํ•œ ํ•™์Šต Algorithm์„ ์‚ฌ์šฉ ์—ฌ๋Ÿฌ Model ํ•™์Šต
Weak learner๋ฅผ ๊ฒฐํ•ฉํ•˜๋ฉด, Single learner๋ณด๋‹ค ๋‚˜์€ ์„ฑ๋Šฅ
RandomForestClassifier is itself an ensemble
ML libraries make ensembles very easy to use
You do not necessarily need many machines or massive data to use them
Random Forest
Bagging, Boosting, and Stacking
๋™์ผํ•œ ํ•™์Šต ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์‚ฌ์šฉํ•˜๋Š” ๋ฐฉ๋ฒ•์„ Ensemble
์„œ๋กœ ๋‹ค๋ฅธ ๋ชจ๋ธ์„ ๊ฒฐํ•ฉํ•˜์—ฌ ์ƒˆ๋กœ์šด ๋ชจ๋ธ์„ ๋งŒ๋“œ๋Š” ๊ฒƒ
Bagging
ํ•™์Šต ๋ฐ์ดํ„ฐ๋ฅผ ๋žœ๋ค์œผ๋กœ Sampling ํ•˜์—ฌ ์—ฌ๋Ÿฌ ๊ฐœ์˜ Bag์œผ๋กœ ๋ถ„ํ• ํ•˜๊ณ ,
๊ฐ Bag๋ณ„๋กœ ๋ชจ๋ธ์„ ํ•™์Šตํ•œ ํ›„, ๊ฐ ๊ฒฐ๊ณผ๋ฅผ ํ•ฉํ•˜์—ฌ ์ตœ์ข… ๊ฒฐ๊ณผ๋ฅผ ๋„์ถœ
n: ์ „์ฒด ํ•™์Šต data ์ˆ˜
nโ€˜: bag์— ํฌํ•จ๋œ data ์ˆ˜, ์ „์ฒด data ์ค‘ sampling data
m: bag์˜ ๊ฐœ์ˆ˜, ํ•™์Šตํ•  ๋ชจ๋ธ๋ณ„๋กœ sampling data set
Bagging (Regression)
[Figure: individual learners have low bias but high variance; averaging them keeps the bias low while reducing the variance]
Bagging
๋Œ€ํ‘œ์ ์ธ Bagging Algorithm : Random Forest
Categorical Data : Voting
Continuous Data : Average
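A minimal sketch of the bagging recipe above for continuous data: bootstrap m bags of n′ points each, train one trivial "model" per bag, then average. The data and the mean-predictor "model" are illustrative assumptions:

```python
import random
import statistics

random.seed(42)

# toy training data: 200 noisy observations of a quantity centered at 10
data = [10 + random.gauss(0, 3) for _ in range(200)]  # n = 200

def make_bag(data, n_prime):
    """Sample n' points with replacement -- one bootstrap bag."""
    return [random.choice(data) for _ in range(n_prime)]

m = 30          # m: number of bags (one model per bag)
n_prime = 100   # n': data points per bag

# each "model" here is trivially the mean of its bag; for continuous
# data the bagged prediction averages the m models
bag_models = [statistics.mean(make_bag(data, n_prime)) for _ in range(m)]
bagged_prediction = statistics.mean(bag_models)
print(f"bagged prediction: {bagged_prediction:.2f}")
```

For categorical data the last step would be a majority vote over the m models instead of an average.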
Boosting
๊ฐ€์ค‘์น˜๋ฅผ ๋ถ€์—ฌ
์ •ํ™•๋„ Outlier
1. Extract the examples that D1 mispredicted
2. Train D2 with extra weight on those errors
3. Pass the examples that are still mispredicted on again
4. Extract the examples that D2 mispredicted
5. Train D3 with extra weight on those errors
Stacking
Meta Modeling โ€œTwo heads are better than oneโ€
ํ•™์ƒ์„ ์ž˜ ํŒŒ์•…ํ•˜๊ณ  ์žˆ์Œ
https://github.com/kaz-Anova/StackNet
์„ ์ƒ๋‹˜์„ ์ž˜ ํŒŒ์•…ํ•˜๊ณ  ์žˆ์Œ
Boosting
Algorithm ํŠน์ง•
AdaBoost ๋‹ค์ˆ˜๊ฒฐ์„ ํ†ตํ•œ ์ •๋‹ต ๋ถ„๋ฅ˜ ๋ฐ ์˜ค๋‹ต์— ๊ฐ€์ค‘์น˜ ๋ถ€์—ฌ
GBM Loss Function์˜ Gradient๋ฅผ ํ†ตํ•ด ์˜ค๋‹ต์— ๊ฐ€์ค‘์น˜ ๋ถ€์—ฌ
Xgboost GBM ๋Œ€๋น„ ์„ฑ๋Šฅ ํ–ฅ์ƒ
์‹œ์Šคํ…œ ์ž์› ํšจ์œจ์  ํ™œ์šฉ(CPU, Mem)
Kaggle์„ ํ†ตํ•œ ์„ฑ๋Šฅ ๊ฒ€์ฆ(๋งŽ์€ ์ƒ์œ„ ๋žญ์ปค๊ฐ€ ์‚ฌ์šฉ)
2014๋…„ ๊ณต๊ฐœ
Light GBM Xgboost ๋Œ€๋น„ ์„ฑ๋Šฅ ํ–ฅ์ƒ ๋ฐ ์ž์›์†Œ๋ชจ ์ตœ์†Œํ™”
Xgboost๊ฐ€ ์ฒ˜๋ฆฌํ•˜์ง€ ๋ชปํ•˜๋Š” ๋Œ€์šฉ๋Ÿ‰ ๋ฐ์ดํ„ฐ ํ•™์Šต ๊ฐ€๋Šฅ
Approximates the split์„ ํ†ตํ•œ ์„ฑ๋Šฅ ํ–ฅ์ƒ
2016๋…„ ๊ณต๊ฐœ
Boosting Algorithm
AdaBoost (Adaptive Boosting)
๋ถ„๋ฅ˜๋ฅผ ๋ชปํ•˜๋ฉด ๊ฐ€์ค‘์น˜๊ฐ€ ๊ณ„์† ๋‚จ์Œ
Cost function: reflects the weights (W)
GBM (Gradient Boosting)
AdaBoost์™€ ๊ธฐ๋ณธ ๊ฐœ๋…์€ ๋™์ผ
๊ฐ€์ค‘์น˜ ๊ณ„์‚ฐ ๋ฐฉ์‹์ด Gradient Descent ์ด์šฉ ํ•˜์—ฌ ์ตœ์ ์˜ ํŒŒ๋ผ๋ฉ”ํ„ฐ ์ฐพ๊ธฐ
Gradient Decent
[Figure: gradient descent illustration with parameters alpha, beta, gamma]
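A minimal gradient descent sketch, assuming the alpha/beta/gamma in the figure stand for different learning rates (the quadratic objective is an illustrative assumption):

```python
def gradient(w):
    return 2 * (w - 3)  # derivative of f(w) = (w - 3)^2, minimized at w = 3

def descend(lr, steps=50, w=0.0):
    for _ in range(steps):
        w -= lr * gradient(w)  # step against the gradient
    return w

for lr in (0.01, 0.1, 0.5):
    print(f"learning rate {lr}: w ends at {descend(lr):.4f}")
```

Too small a learning rate converges slowly; on this particular quadratic a rate of 0.5 jumps to the minimum in one step.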
XGBoost (eXtreme Gradient Boosting)
GBM + ๋ถ„์‚ฐ / ๋ณ‘๋ ฌ ์ฒ˜๋ฆฌ ์ง€์›
A set of CARTs (Classification And Regression Trees)
How are tree nodes split?
Gain = left leaf score + right leaf score − original (unsplit) leaf score, minus a regularization term
If the gain is smaller than γ, the node is not split
Light GBM
โ€ข Decision Tree ์•Œ๊ณ ๋ฆฌ์ฆ˜๊ธฐ๋ฐ˜์˜ GBM ํ”„๋ ˆ์ž„์›Œํฌ (๋น ๋ฅด๊ณ , ๋†’์€ ์„ฑ๋Šฅ)
โ€ข Ranking, classification ๋“ฑ์˜ ๋ฌธ์ œ์— ํ™œ์šฉ
์ฐจ์ด์ 
โ€ข Leaf-wise๋กœ tree๋ฅผ ์„ฑ์žฅ(์ˆ˜์ง ๋ฐฉํ–ฅ) , ๋‹ค๋ฅธ ์•Œ๊ณ ๋ฆฌ์ฆ˜ (Level-wise)
โ€ข ์ตœ๋Œ€ delta loss์˜ leaf๋ฅผ ์„ฑ์žฅ
โ€ข ๋™์ผํ•œ leaf๋ฅผ ์„ฑ์žฅํ• ๋•Œ, Leaf-wise๊ฐ€ loss๋ฅผ ๋” ์ค„์ผ ์ˆ˜ ์žˆ๋‹ค.
Light GBM ์ธ๊ธฐ
โ€ข ๋Œ€๋Ÿ‰์˜ ๋ฐ์ดํ„ฐ๋ฅผ ๋ณ‘๋ ฌ๋กœ ๋น ๋ฅด๊ฒŒ ํ•™์Šต๊ฐ€๋Šฅ (Low Memory, GPU ํ™œ์šฉ๊ฐ€๋Šฅ)
โ€ข ์˜ˆ์ธก์ •ํ™•๋„๊ฐ€ ๋” ๋†’์Œ(Leaf-wise tree์˜ ์žฅ์  ร  ๊ณผ์ ‘ํ•ฉ์— ๋ฏผ๊ฐ)
์†๋„
โ€ข XGBoost ๋Œ€๋น„ 2~10๋ฐฐ (๋™์ผํ•œ ํŒŒ๋ผ๋ฏธํ„ฐ ์„ค์ •์‹œ)
์‚ฌ์šฉ ๋นˆ๋„
โ€ข Light GBM์ด ์„ค์น˜๋œ ํˆด์ด ๋งŽ์ด ์—†์Œ. XGBoost(2014), Light GBM(2016)
ํ™œ์šฉ
โ€ข Leaf-wise Tree๋Š” overfitting์— ๋ฏผ๊ฐํ•˜์—ฌ, ๋Œ€๋Ÿ‰์˜ ๋ฐ์ดํ„ฐ ํ•™์Šต์— ์ ํ•ฉ
โ€ข ์ ์–ด๋„ 10,000 ๊ฑด ์ด์ƒ
์ฐธ์กฐ์ž๋ฃŒ
https://mlwave.com/kaggle-ensembling-guide/
http://pythonkim.tistory.com/42
https://www.slideshare.net/potaters/decision-forests-and-discriminant-analysis
https://github.com/kaz-Anova/StackNet
https://swalloow.github.io/bagging-boosting
https://www.slideshare.net/freepsw/boosting-bagging-vs-boosting
https://www.youtube.com/watch?v=GM3CDQfQ4sw
http://blog.kaggle.com/2017/06/15/stacking-made-easy-an-introduction-to-stacknet-by-competitions-grandmaster-marios-michailidis-kazanova/
https://www.analyticsvidhya.com/blog/2015/09/complete-guide-boosting-methods/
https://communedeart.com/2017/06/25/xgboost-%EC%82%AC%EC%9A%A9%ED%95%98%EA%B8%B0/
http://xgboost.readthedocs.io/en/latest/model.html
๊ฐ์‚ฌํ•ฉ๋‹ˆ๋‹ค.
