SlideShare a Scribd company logo
Forest Tree Algorithm
Random Forest is a popular machine learning algorithm that belongs to
the supervised learning technique.
It can be used for both Classification and Regression problems in ML. It
is based on the concept of ensemble learning,
which is a process of combining multiple classifiers to solve a complex
problem and to improve the performance of the model.
"Random Forest is a classifier that contains a number of decision
trees on various subsets of the given dataset and takes the average to
improve the predictive accuracy of that dataset
The greater number of trees in the forest leads to higher accuracy and
prevents the problem of overfitting.
• Assumptions for Random Forest
• here should be some actual values in the feature variable of the
dataset so that the classifier can predict accurate results rather than a
guessed result.
• The predictions from each tree must have very low correlations.
• Why use Random Forest?
• It takes less training time as compared to other algorithms.
• It predicts output with high accuracy, even for the large dataset it
runs efficiently.
• It can also maintain accuracy when a large proportion of data is
missing.
• How does Random Forest algorithm work?
• Random Forest works in two-phase first is to create the random forest
by combining N decision tree, and second is to make predictions for
each tree created in the first phase.
• The Working process can be explained in the below steps and
diagram:
• Step-1: Select random K data points from the training set.
• Step-2: Build the decision trees associated with the selected data
points (Subsets).
• Step-3: Choose the number N for decision trees that you want to
build.
• Step-4: Repeat Step 1 & 2.
• Step-5: For new data points, find the predictions of each decision
tree, and assign the new data points to the category that wins the
majority votes.
• Example: Suppose there is a dataset that contains multiple fruit
images. So, this dataset is given to the Random forest classifier. The
dataset is divided into subsets and given to each decision tree. During
the training phase, each decision tree produces a prediction result,
and when a new data point occurs, then based on the majority of
results, the Random Forest classifier predicts the final decision.
Consider the below image:
• Applications of Random Forest
• There are mainly four sectors where Random forest mostly used:
• Banking: Banking sector mostly uses this algorithm for the identification of
loan risk.
• Medicine: With the help of this algorithm, disease trends and risks of the
disease can be identified.
• Land Use: We can identify the areas of similar land use by this algorithm.
• Marketing: Marketing trends can be identified using this algorithm.
• Advantages of Random Forest
• Random Forest is capable of performing both Classification and Regression
tasks.
• It is capable of handling large datasets with high dimensionality.
• It enhances the accuracy of the model and prevents the overfitting issue.
• Disadvantages of Random Forest
• Although random forest can be used for both classification and regression
tasks, it is not more suitable for Regression tasks.
• Implementation Steps are given below:
• Data Pre-processing step
• Fitting the Random forest algorithm to the Training set
• Predicting the test result
• Test accuracy of the result (Creation of Confusion matrix)
• Visualizing the test set result.
Forest Tree Algorithm in Python
• # importing libraries
• import numpy as nm
• import matplotlib.pyplot as mtp
• import pandas as pd
•
• #importing datasets
• data_set= pd.read_csv('User_Data.csv')
•
• #Extracting Independent and dependent Variable
• x= data_set.iloc[:, [2,3]].values
• y= data_set.iloc[:, 4].values
• # Splitting the dataset into training and test set.
• from sklearn.model_selection import train_test_split
• x_train, x_test, y_train, y_test= train_test_split(x, y, test_size= 0.25,
random_state=0)
•
• #feature Scaling
• from sklearn.preprocessing import StandardScaler
• st_x= StandardScaler()
• x_train= st_x.fit_transform(x_train)
• x_test= st_x.transform(x_test)
• #Fitting Decision Tree classifier to the training set
• from sklearn.ensemble import RandomForestClassifier
• classifier= RandomForestClassifier(n_estimators= 10, criterion="entropy")
• classifier.fit(x_train, y_train)
• #Predicting the test set result
• y_pred= classifier.predict(x_test)
• #Creating the Confusion matrix
• from sklearn.metrics import confusion_matrix
• cm= confusion_matrix(y_test, y_pred)
• #Visualizing the training Set result
• from matplotlib.colors import ListedColormap
• x_set, y_set = x_train, y_train
• x1, x2 = nm.meshgrid(nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:,
0].max() + 1, step =0.01),
• nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step =
0.01))
• mtp.contourf(x1, x2, classifier.predict(nm.array([x1.ravel(),
x2.ravel()]).T).reshape(x1.shape),
• alpha = 0.75, cmap = ListedColormap(('purple','green' )))
• mtp.xlim(x1.min(), x1.max())
• mtp.ylim(x2.min(), x2.max())
• for i, j in enumerate(nm.unique(y_set)):
• mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
• c = ListedColormap(('purple', 'green'))(i), label = j)
• mtp.title('Random Forest Algorithm (Training set)')
• mtp.xlabel('Age')
• mtp.ylabel('Estimated Salary')
• mtp.legend()
• mtp.show()
Random Forest Decision Tree.pptx

More Related Content

Similar to Random Forest Decision Tree.pptx

Net campus2015 antimomusone
Net campus2015 antimomusoneNet campus2015 antimomusone
Net campus2015 antimomusone
DotNetCampus
 

Similar to Random Forest Decision Tree.pptx (20)

Building largescalepredictionsystemv1
Building largescalepredictionsystemv1Building largescalepredictionsystemv1
Building largescalepredictionsystemv1
 
04 Classification in Data Mining
04 Classification in Data Mining04 Classification in Data Mining
04 Classification in Data Mining
 
Support Vector machine(SVM) and Random Forest
Support Vector machine(SVM) and Random ForestSupport Vector machine(SVM) and Random Forest
Support Vector machine(SVM) and Random Forest
 
Module-4_Part-II.pptx
Module-4_Part-II.pptxModule-4_Part-II.pptx
Module-4_Part-II.pptx
 
Identifying and classifying unknown Network Disruption
Identifying and classifying unknown Network DisruptionIdentifying and classifying unknown Network Disruption
Identifying and classifying unknown Network Disruption
 
Experimental Design for Distributed Machine Learning with Myles Baker
Experimental Design for Distributed Machine Learning with Myles BakerExperimental Design for Distributed Machine Learning with Myles Baker
Experimental Design for Distributed Machine Learning with Myles Baker
 
Lecture 7 - Bias, Variance and Regularization, a lecture in subject module St...
Lecture 7 - Bias, Variance and Regularization, a lecture in subject module St...Lecture 7 - Bias, Variance and Regularization, a lecture in subject module St...
Lecture 7 - Bias, Variance and Regularization, a lecture in subject module St...
 
Facial Emotion Detection on Children's Emotional Face
Facial Emotion Detection on Children's Emotional FaceFacial Emotion Detection on Children's Emotional Face
Facial Emotion Detection on Children's Emotional Face
 
Intro to Machine Learning by Microsoft Ventures
Intro to Machine Learning by Microsoft VenturesIntro to Machine Learning by Microsoft Ventures
Intro to Machine Learning by Microsoft Ventures
 
Competition16
Competition16Competition16
Competition16
 
Unit 1-ML (1) (1).pptx
Unit 1-ML (1) (1).pptxUnit 1-ML (1) (1).pptx
Unit 1-ML (1) (1).pptx
 
PREDICT THE FUTURE , MACHINE LEARNING & BIG DATA
PREDICT THE FUTURE , MACHINE LEARNING & BIG DATAPREDICT THE FUTURE , MACHINE LEARNING & BIG DATA
PREDICT THE FUTURE , MACHINE LEARNING & BIG DATA
 
Net campus2015 antimomusone
Net campus2015 antimomusoneNet campus2015 antimomusone
Net campus2015 antimomusone
 
Dimensionality Reduction in Machine Learning
Dimensionality Reduction in Machine LearningDimensionality Reduction in Machine Learning
Dimensionality Reduction in Machine Learning
 
Machine learning session6(decision trees random forrest)
Machine learning   session6(decision trees random forrest)Machine learning   session6(decision trees random forrest)
Machine learning session6(decision trees random forrest)
 
Unit 3 – AIML.pptx
Unit 3 – AIML.pptxUnit 3 – AIML.pptx
Unit 3 – AIML.pptx
 
Hyperparameter Tuning
Hyperparameter TuningHyperparameter Tuning
Hyperparameter Tuning
 
Lecture 9 - Decision Trees and Ensemble Methods, a lecture in subject module ...
Lecture 9 - Decision Trees and Ensemble Methods, a lecture in subject module ...Lecture 9 - Decision Trees and Ensemble Methods, a lecture in subject module ...
Lecture 9 - Decision Trees and Ensemble Methods, a lecture in subject module ...
 
ngboost.pptx
ngboost.pptxngboost.pptx
ngboost.pptx
 
dimension reduction.ppt
dimension reduction.pptdimension reduction.ppt
dimension reduction.ppt
 

More from Ramakrishna Reddy Bijjam

More from Ramakrishna Reddy Bijjam (20)

Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Arrays to arrays and pointers with arrays.pptx
Arrays to arrays and pointers with arrays.pptxArrays to arrays and pointers with arrays.pptx
Arrays to arrays and pointers with arrays.pptx
 
Auxiliary, Cache and Virtual memory.pptx
Auxiliary, Cache and Virtual memory.pptxAuxiliary, Cache and Virtual memory.pptx
Auxiliary, Cache and Virtual memory.pptx
 
Python With MongoDB in advanced Python.pptx
Python With MongoDB in advanced Python.pptxPython With MongoDB in advanced Python.pptx
Python With MongoDB in advanced Python.pptx
 
Pointers and single &multi dimentionalarrays.pptx
Pointers and single &multi dimentionalarrays.pptxPointers and single &multi dimentionalarrays.pptx
Pointers and single &multi dimentionalarrays.pptx
 
Certinity Factor and Dempster-shafer theory .pptx
Certinity Factor and Dempster-shafer theory .pptxCertinity Factor and Dempster-shafer theory .pptx
Certinity Factor and Dempster-shafer theory .pptx
 
Auxiliary Memory in computer Architecture.pptx
Auxiliary Memory in computer Architecture.pptxAuxiliary Memory in computer Architecture.pptx
Auxiliary Memory in computer Architecture.pptx
 
K Means Clustering in ML.pptx
K Means Clustering in ML.pptxK Means Clustering in ML.pptx
K Means Clustering in ML.pptx
 
Pandas.pptx
Pandas.pptxPandas.pptx
Pandas.pptx
 
Python With MongoDB.pptx
Python With MongoDB.pptxPython With MongoDB.pptx
Python With MongoDB.pptx
 
Python with MySql.pptx
Python with MySql.pptxPython with MySql.pptx
Python with MySql.pptx
 
PYTHON PROGRAMMING NOTES RKREDDY.pdf
PYTHON PROGRAMMING NOTES RKREDDY.pdfPYTHON PROGRAMMING NOTES RKREDDY.pdf
PYTHON PROGRAMMING NOTES RKREDDY.pdf
 
BInary file Operations.pptx
BInary file Operations.pptxBInary file Operations.pptx
BInary file Operations.pptx
 
Data Science in Python.pptx
Data Science in Python.pptxData Science in Python.pptx
Data Science in Python.pptx
 
CSV JSON and XML files in Python.pptx
CSV JSON and XML files in Python.pptxCSV JSON and XML files in Python.pptx
CSV JSON and XML files in Python.pptx
 
HTML files in python.pptx
HTML files in python.pptxHTML files in python.pptx
HTML files in python.pptx
 
Regular Expressions in Python.pptx
Regular Expressions in Python.pptxRegular Expressions in Python.pptx
Regular Expressions in Python.pptx
 
datareprersentation 1.pptx
datareprersentation 1.pptxdatareprersentation 1.pptx
datareprersentation 1.pptx
 
Apriori.pptx
Apriori.pptxApriori.pptx
Apriori.pptx
 
Eclat.pptx
Eclat.pptxEclat.pptx
Eclat.pptx
 

Recently uploaded

一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
ewymefz
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
ewymefz
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
Opendatabay
 
Computer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage sComputer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage s
MAQIB18
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
vcaxypu
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
ukgaet
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
ewymefz
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
vcaxypu
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
StarCompliance.io
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
ewymefz
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
enxupq
 

Recently uploaded (20)

社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
Using PDB Relocation to Move a Single PDB to Another Existing CDB
Using PDB Relocation to Move a Single PDB to Another Existing CDBUsing PDB Relocation to Move a Single PDB to Another Existing CDB
Using PDB Relocation to Move a Single PDB to Another Existing CDB
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
Slip-and-fall Injuries: Top Workers' Comp Claims
Slip-and-fall Injuries: Top Workers' Comp ClaimsSlip-and-fall Injuries: Top Workers' Comp Claims
Slip-and-fall Injuries: Top Workers' Comp Claims
 
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
Computer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage sComputer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage s
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
 
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPsWebinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
Webinar One View, Multiple Systems No-Code Integration of Salesforce and ERPs
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 

Random Forest Decision Tree.pptx

  • 1. Forest Tree Algorithm Random Forest is a popular machine learning algorithm that belongs to the supervised learning technique. It can be used for both Classification and Regression problems in ML. It is based on the concept of ensemble learning, which is a process of combining multiple classifiers to solve a complex problem and to improve the performance of the model. "Random Forest is a classifier that contains a number of decision trees on various subsets of the given dataset and takes the average to improve the predictive accuracy of that dataset The greater number of trees in the forest leads to higher accuracy and prevents the problem of overfitting.
  • 2.
  • 3. • Assumptions for Random Forest • here should be some actual values in the feature variable of the dataset so that the classifier can predict accurate results rather than a guessed result. • The predictions from each tree must have very low correlations. • Why use Random Forest? • It takes less training time as compared to other algorithms. • It predicts output with high accuracy, even for the large dataset it runs efficiently. • It can also maintain accuracy when a large proportion of data is missing.
  • 4. • How does Random Forest algorithm work? • Random Forest works in two-phase first is to create the random forest by combining N decision tree, and second is to make predictions for each tree created in the first phase. • The Working process can be explained in the below steps and diagram: • Step-1: Select random K data points from the training set. • Step-2: Build the decision trees associated with the selected data points (Subsets). • Step-3: Choose the number N for decision trees that you want to build. • Step-4: Repeat Step 1 & 2. • Step-5: For new data points, find the predictions of each decision tree, and assign the new data points to the category that wins the majority votes.
  • 5. • Example: Suppose there is a dataset that contains multiple fruit images. So, this dataset is given to the Random forest classifier. The dataset is divided into subsets and given to each decision tree. During the training phase, each decision tree produces a prediction result, and when a new data point occurs, then based on the majority of results, the Random Forest classifier predicts the final decision. Consider the below image:
  • 6. • Applications of Random Forest • There are mainly four sectors where Random forest mostly used: • Banking: Banking sector mostly uses this algorithm for the identification of loan risk. • Medicine: With the help of this algorithm, disease trends and risks of the disease can be identified. • Land Use: We can identify the areas of similar land use by this algorithm. • Marketing: Marketing trends can be identified using this algorithm. • Advantages of Random Forest • Random Forest is capable of performing both Classification and Regression tasks. • It is capable of handling large datasets with high dimensionality. • It enhances the accuracy of the model and prevents the overfitting issue. • Disadvantages of Random Forest • Although random forest can be used for both classification and regression tasks, it is not more suitable for Regression tasks.
  • 7. • Implementation Steps are given below: • Data Pre-processing step • Fitting the Random forest algorithm to the Training set • Predicting the test result • Test accuracy of the result (Creation of Confusion matrix) • Visualizing the test set result.
  • 8. Forest Tree Algorithm in Python • # importing libraries • import numpy as nm • import matplotlib.pyplot as mtp • import pandas as pd • • #importing datasets • data_set= pd.read_csv('User_Data.csv') • • #Extracting Independent and dependent Variable • x= data_set.iloc[:, [2,3]].values • y= data_set.iloc[:, 4].values
  • 9. • # Splitting the dataset into training and test set. • from sklearn.model_selection import train_test_split • x_train, x_test, y_train, y_test= train_test_split(x, y, test_size= 0.25, random_state=0) • • #feature Scaling • from sklearn.preprocessing import StandardScaler • st_x= StandardScaler() • x_train= st_x.fit_transform(x_train) • x_test= st_x.transform(x_test) • #Fitting Decision Tree classifier to the training set • from sklearn.ensemble import RandomForestClassifier • classifier= RandomForestClassifier(n_estimators= 10, criterion="entropy") • classifier.fit(x_train, y_train) • #Predicting the test set result • y_pred= classifier.predict(x_test)
  • 10. • #Creating the Confusion matrix • from sklearn.metrics import confusion_matrix • cm= confusion_matrix(y_test, y_pred) • #Visualizing the training Set result • from matplotlib.colors import ListedColormap • x_set, y_set = x_train, y_train • x1, x2 = nm.meshgrid(nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:, 0].max() + 1, step =0.01), • nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step = 0.01)) • mtp.contourf(x1, x2, classifier.predict(nm.array([x1.ravel(), x2.ravel()]).T).reshape(x1.shape), • alpha = 0.75, cmap = ListedColormap(('purple','green' ))) • mtp.xlim(x1.min(), x1.max()) • mtp.ylim(x2.min(), x2.max())
  • 11. • for i, j in enumerate(nm.unique(y_set)): • mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1], • c = ListedColormap(('purple', 'green'))(i), label = j) • mtp.title('Random Forest Algorithm (Training set)') • mtp.xlabel('Age') • mtp.ylabel('Estimated Salary') • mtp.legend() • mtp.show()