XgBoost Python Package
eXtreme Gradient Boosting
Presented by :
Fareed Kang
Harsh Gupta
Suman Kr. Suman
XgBoost ?
● XGBoost is a gradient-boosted decision tree library for supervised learning
● It is designed for predictive performance and computational speed
● Open source
● Core implementation in C++, created by Tianqi Chen
● Widely used in Kaggle competitions
Prerequisite ● Decision Tree ● Gradient Boosting
[Figure: decision tree diagram labeling the Root, Internal, and Leaf nodes]
1. Nodes and Branches
- A Decision Tree consists of nodes and branches.
- Nodes represent questions (tests) on feature values.
- Branches represent the possible answers to those questions.
2. Root Node
- The top-most node is called the root node.
3. Internal Nodes
- Nodes between the root node and the leaf nodes are internal nodes.
- They represent intermediate decisions/questions.
4. Leaf Nodes
- Terminal nodes are called leaf nodes.
- They represent the final outcomes or predictions.
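The node types above can be sketched in a few lines of plain Python. This is a minimal illustration, not how XGBoost stores trees: the `Node` class, the `predict` helper, and the feature names `temperature` and `humidity` are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One node of a binary decision tree (hypothetical helper, for illustration only)."""
    feature: Optional[str] = None      # feature tested at this node (None for a leaf)
    threshold: Optional[float] = None  # go left if value <= threshold, else right
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    prediction: Optional[str] = None   # set only on leaf nodes

    def is_leaf(self) -> bool:
        return self.prediction is not None


def predict(node: Node, sample: dict) -> str:
    """Follow branches from the root until a leaf is reached."""
    while not node.is_leaf():
        node = node.left if sample[node.feature] <= node.threshold else node.right
    return node.prediction


# The root node tests temperature; one internal node tests humidity.
tree = Node(
    feature="temperature", threshold=20.0,
    left=Node(prediction="stay in"),        # leaf
    right=Node(                             # internal node
        feature="humidity", threshold=70.0,
        left=Node(prediction="go out"),     # leaf
        right=Node(prediction="stay in"),   # leaf
    ),
)

print(predict(tree, {"temperature": 25.0, "humidity": 50.0}))  # go out
```

Each call to `predict` walks one path from the root to a leaf, so the cost of a prediction grows with the depth of the tree, not with its total number of nodes.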
Example
Example : Depth 1
Example : Sample at Depth 1
Example : Decision Tree Depth 2
Example : The process repeats until we reach the final Decision Tree
Prerequisite ● Decision Tree ● Gradient Boosting
Gradient boosting is one of the most powerful techniques for building predictive models, and it can be
viewed as a generalization of AdaBoost. Its main objective is to minimize a loss function by adding weak
learners one at a time, using a procedure analogous to gradient descent.
Gradient boosting has three main components.
● Loss Function: Measures how well the model's predictions match the given data. The choice
depends on the type of problem (for example, squared error for regression or log loss for
classification).
● Weak Learner: A model that performs only slightly better than random guessing. In gradient
boosting the weak learners are usually shallow decision trees.
● Additive Model: Trees are added sequentially, one at a time, and the trees already in the
model are left unchanged. Each new tree is fitted to the negative gradient of the loss (the
residuals, in the case of squared error), so adding it reduces the loss.
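The three components can be sketched with one-split trees (stumps) and squared-error loss. This is a simplified illustration in plain Python under those assumptions, not XGBoost's algorithm; the helper names `fit_stump`, `gradient_boost`, and `mse`, and the toy data, are made up for the example.

```python
def fit_stump(x, residuals):
    """Weak learner: the single split on x that best fits the residuals (squared error)."""
    best = None
    for threshold in sorted(set(x))[:-1]:  # skip the largest value so the right side is never empty
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


def gradient_boost(x, y, n_trees=50, learning_rate=0.1):
    """Additive model: start from the mean, then add stumps fitted to the residuals."""
    base = sum(y) / len(y)
    stumps = []
    preds = [base] * len(y)
    for _ in range(n_trees):
        # For squared-error loss the negative gradient is the residual y - prediction.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        preds = [pi + learning_rate * stump(xi) for xi, pi in zip(x, preds)]
    return lambda xi: base + learning_rate * sum(s(xi) for s in stumps)


def mse(model, x, y):
    """Loss function: mean squared error of the model on (x, y)."""
    return sum((model(xi) - yi) ** 2 for xi, yi in zip(x, y)) / len(y)


# Toy one-feature data set (made up for the example).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.0]

few = gradient_boost(x, y, n_trees=5)
many = gradient_boost(x, y, n_trees=100)
print(mse(few, x, y), mse(many, x, y))  # training error shrinks as trees are added
```

Earlier stumps are never modified; each round only appends a new one, which is the "additive" part of the model.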
XgBoost : Evolution
XgBoost : Newton-Raphson Optimization Method
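The slide body did not survive extraction, so the following is only a sketch of the idea named in the title, based on the XGBoost paper listed in the references: each leaf value is a Newton-style step w = -G / (H + lambda), where G and H are the sums of the first and second derivatives of the loss over the samples in the leaf and lambda is the L2 regularization term. The function names below are hypothetical.

```python
import math


def newton_leaf_weight(grads, hessians, reg_lambda=1.0):
    """Newton-style leaf value: w = -sum(g) / (sum(h) + lambda)."""
    return -sum(grads) / (sum(hessians) + reg_lambda)


def logistic_grad_hess(y_true, raw_score):
    """First and second derivatives of log loss with respect to the raw score."""
    p = 1.0 / (1.0 + math.exp(-raw_score))
    return p - y_true, p * (1.0 - p)


# Three samples that fall in the same leaf, all starting from raw score 0.0.
labels = [1, 1, 0]
pairs = [logistic_grad_hess(label, 0.0) for label in labels]
grads = [g for g, _ in pairs]
hessians = [h for _, h in pairs]

print(newton_leaf_weight(grads, hessians, reg_lambda=1.0))  # 0.5 / 1.75, about 0.2857
```

With squared-error loss the second derivative is 1 for every sample, so with lambda set to 0 the same formula reduces to the mean residual, which is the plain gradient boosting update.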
XgBoost : Benefits?
XgBoost Package in Python : Installation
Installing through pip in cmd: pip install xgboost
Installing through pip in a Colab/notebook code cell: !pip install xgboost
XgBoost Package in Python : Classifier model
# Importing
from sklearn.model_selection import train_test_split
import xgboost as xgb
from xgboost import XGBClassifier

# Initialize the XGBoost classifier
# early_stopping_rounds needs an eval_set passed to fit()
xgb_model = XGBClassifier(objective='binary:logistic',
                          early_stopping_rounds=10,
                          eval_metric='aucpr',
                          missing=0)  # missing=0 treats every 0 in the features as a missing value
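# Train the model
# (X_train, X_test, y_train, y_test are assumed to come from train_test_split on your own data)
xgb_model.fit(X_train, y_train, verbose=True,
              eval_set=[(X_test, y_test)])  # evaluation set monitored for early stopping

# Prediction
y_pred = xgb_model.predict(X_test)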
XgBoost Package in Python : Regressor model
import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.metrics import mean_squared_error as mse
from sklearn.model_selection import train_test_split

# Load the California housing dataset
housing = fetch_california_housing()
X = pd.DataFrame(housing.data, columns=housing.feature_names)
y = pd.Series(housing.target, name='target')

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.2, random_state=42)
from xgboost import XGBRegressor

# Create an XGBoost regressor
xgb_reg = XGBRegressor(objective='reg:squarederror',
                       random_state=42)

# Fit the model on the training data
xgb_reg.fit(X_train, y_train)

# Make predictions on the test data
y_pred = xgb_reg.predict(X_test)

# Calculate the Mean Squared Error
print(f"Mean Squared Error: {mse(y_test, y_pred):.2f}")
XgBoost : Common Parameters
❖ booster: Specifies the type of booster to use. It can be one of the following:
➢ gbtree: Tree-based models (default).
➢ gblinear: Linear models.
➢ dart: Tree-based models with dropout (Dropouts meet Multiple Additive Regression Trees).
❖ n_estimators: The number of boosting rounds (trees) to train. Increasing this value can improve performance but lengthens training and can lead to overfitting.
❖ learning_rate: Shrinks the contribution of each new tree (alias eta). Smaller values usually need more trees.
❖ max_depth: The maximum depth of each tree. Deeper trees are more expressive but more prone to overfitting.
❖ nthread: Number of parallel threads used to run XGBoost (n_jobs in the scikit-learn wrapper).
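As a rough sketch, these parameters are usually passed as keyword arguments to the scikit-learn style estimators. The values below are hypothetical placeholders, not recommendations, and `build_classifier` is a helper invented for this example.

```python
# Hypothetical values chosen for illustration; tune them for your own data.
common_params = {
    "booster": "gbtree",    # tree-based model
    "n_estimators": 200,    # number of boosting rounds (trees)
    "learning_rate": 0.1,   # shrinkage applied to each new tree
    "max_depth": 4,         # maximum depth of each tree
    "n_jobs": 4,            # parallel threads (scikit-learn wrapper name for nthread)
}


def build_classifier(params):
    """Create an XGBClassifier from a parameter dict (requires xgboost to be installed)."""
    from xgboost import XGBClassifier  # imported lazily so the dict can be inspected without xgboost
    return XGBClassifier(**params)


# Usage, assuming X_train and y_train hold your own training data:
# model = build_classifier(common_params)
# model.fit(X_train, y_train)
```

Keeping the settings in one dict makes it easy to reuse them across experiments or hand them to a hyperparameter search.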
XgBoost : XGBClassifier Parameters
❖ objective: Specifies the learning task and corresponding objective function. Common options include:
➢ 'binary:logistic': Binary classification; outputs the probability of the positive class.
➢ 'multi:softmax': Multiclass classification; outputs the predicted class.
➢ 'multi:softprob': Multiclass classification; outputs a probability for each class.
❖ eval_metric: The evaluation metric computed on the evaluation set(s) during training. Common options include:
➢ 'logloss': Logarithmic loss for binary classification.
➢ 'mlogloss': Multiclass logarithmic loss.
➢ 'auc': Area under the ROC curve.
❖ num_class: Number of classes in the dataset (needed for the multiclass objectives in the native API; XGBClassifier infers it from the labels).
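The difference between the two multiclass objectives, and what 'mlogloss' measures, can be illustrated in plain Python. This is a sketch of the underlying math only, not XGBoost's implementation; the function names and the raw scores are made up.

```python
import math


def softprob(raw_scores):
    """Per-class probabilities from raw class scores (what 'multi:softprob' reports)."""
    top = max(raw_scores)
    exps = [math.exp(s - top) for s in raw_scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def softmax_class(raw_scores):
    """Index of the most probable class (what 'multi:softmax' reports)."""
    probs = softprob(raw_scores)
    return probs.index(max(probs))


def mlogloss(true_classes, prob_rows):
    """Multiclass log loss: mean of -log(probability assigned to the true class)."""
    total = sum(math.log(row[c]) for c, row in zip(true_classes, prob_rows))
    return -total / len(true_classes)


# Made-up raw scores for two samples and three classes.
scores = [[2.0, 0.5, 0.1], [0.2, 0.3, 3.0]]
probs = [softprob(row) for row in scores]

print([softmax_class(row) for row in scores])  # [0, 2]
print(mlogloss([0, 2], probs))
```

Choosing 'multi:softprob' keeps the full probability vector, which is what metrics such as 'mlogloss' need; 'multi:softmax' discards it and returns only the winning class.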
XgBoost : XGBRegressor Parameters
❖ objective: Specifies the learning task and corresponding objective function. Common options include:
➢ 'reg:squarederror': Regression with squared error loss (default).
➢ 'reg:squaredlogerror': Regression with squared log error; labels must be greater than -1.
➢ 'reg:logistic': Logistic regression, for targets in the range [0, 1].
➢ 'count:poisson': Poisson regression for count data.
❖ eval_metric: The evaluation metric computed on the evaluation set(s) during training. Common options include:
➢ 'rmse': Root Mean Squared Error (default for 'reg:squarederror').
➢ 'mae': Mean Absolute Error.
➢ 'poisson-nloglik': Negative log-likelihood, for Poisson regression.
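For reference, the two most common regression metrics can be computed by hand as follows. This is a plain Python sketch with made-up numbers, independent of XGBoost.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Squared Error: penalizes large errors more heavily."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean Absolute Error: average size of the errors, less sensitive to outliers."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 7.0]

print(rmse(y_true, y_pred))  # sqrt((1 + 0 + 4 + 0) / 4) = sqrt(1.25), about 1.118
print(mae(y_true, y_pred))   # (1 + 0 + 2 + 0) / 4 = 0.75
```

On the same predictions RMSE is never smaller than MAE, and the gap widens when a few errors are much larger than the rest.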
References
● https://towardsdatascience.com/decision-tree-classifier-explained-in-real-life-picking-a-vacation-destination-6226b2b60575
● https://explained.ai/gradient-boosting/
● https://towardsdatascience.com/https-medium-com-vishalmorde-xgboost-algorithm-long-she-may-rein-edd9f99be63d
● https://www.nvidia.com/en-us/glossary/data-science/xgboost/
● Paper : https://arxiv.org/pdf/1603.02754.pdf
● Official Doc : https://xgboost.readthedocs.io/en/stable/
● Parallelization in XGBoost : http://zhanpengfang.github.io/418home.html
To get documentation for any object in an IPython or Jupyter session : ?object_name, i.e. a question mark followed by the object
name. In plain Python, use help(object_name).
