Edureka’s Data Science Certification Training (www.edureka.co/data-science)
Random Forest
What Will You Learn Today?
1. Introduction
2. Why Random Forest?
3. What is Random Forest?
4. Random Forest - Example
5. How Random Forest Works?
6. Demo in R: Diabetes Prevention Use Case
Introduction
Introduction To Classification
• Classification is the problem of identifying which of a set of categories a new observation belongs to.
• It is a supervised learning task: the classifier is given a set of already classified examples and learns from them how to label new, unseen examples.
• Example: assigning a given email to the "spam" or "non-spam" category.
Is this A or B?
Types Of Classifiers
Decision Tree
• A decision tree builds classification models in the form of a tree structure.
• It breaks down a dataset into smaller and smaller subsets.
Random Forest
• Random Forest is an ensemble classifier made using many decision tree models.
• Ensemble models combine the results from different models.
Naïve Bayes
• It is a classification technique based on Bayes' Theorem with an assumption of independence among attributes.
Why Random Forest?
Use Case - Credit Risk Detection
• To minimize losses, the bank needs a decision rule that predicts which loan applications to approve.
• An applicant’s demographic (income, debts, credit history) and socio-economic profiles are considered.
• Data science can help banks recognize behavior patterns and provide a complete view of individual customers.
Use Case - Credit Risk Detection
[Figure: decision tree diagram with nodes Student, Credit history, Bank Balance and Age, leaves labelled Risk / No Risk, leading to a final outcome.]
What is Random Forest?
• Random Forest is a versatile algorithm capable of performing both:
i) Regression
ii) Classification
• It is a type of ensemble learning method.
• It is a commonly used predictive modelling and machine learning technique.
Random Forest - Example
Let’s say you want to decide whether to watch “Edge of Tomorrow”.
You can make the decision in one of two ways:
(i) You can ask your best friend.
(ii) You can ask a bunch of friends.
Random Forest - Example
To figure out whether you will like “Edge of Tomorrow”, your best friend will consider a few things, such as:
(i) Whether you like adventure and action
(ii) Whether you like Emily Blunt
In this way, your best friend builds a decision tree.
[Figure: the best friend's decision tree, with nodes "Genre - Adventure", "Cast - Emily Blunt" and "Is Emily Blunt main lead?", Yes/No branches, and leaves Like / Don’t Like.]
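The best friend's reasoning can be read as a small decision tree. Below is a minimal sketch in R, assuming the answers are encoded as TRUE/FALSE flags; the function name `best_friend_says` and the order of the questions are illustrative choices, not taken from the slide.

```r
# Hypothetical encoding of the friend's questions as TRUE/FALSE answers.
best_friend_says <- function(likes_adventure, likes_emily_blunt) {
  if (likes_adventure) {
    # First question: does the genre suit you?
    "Like"
  } else if (likes_emily_blunt) {
    # Second question, asked only when the genre does not appeal.
    "Like"
  } else {
    "Don't Like"
  }
}

print(best_friend_says(likes_adventure = FALSE, likes_emily_blunt = TRUE))
```

Each `if` corresponds to one internal node of the tree, and each returned string to a leaf.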
Random Forest - Example
To get more accurate recommendations, you can ask a bunch of friends, say #Friend1, #Friend2 and #Friend3, and consider their votes.
Each of them may base the decision on different movies and genres.
The majority of the votes decides the final outcome.
In this way, you build a random forest out of a group of friends.
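The majority vote described above can be sketched in a few lines of base R. The three votes are made up for illustration; in a real forest each vote would come from one decision tree.

```r
# Hypothetical votes from three friends (each friend acts as one decision tree).
votes <- c(Friend1 = "Like", Friend2 = "Don't Like", Friend3 = "Like")

# Count the votes per answer and pick the most frequent one.
vote_counts <- table(votes)
final_outcome <- names(vote_counts)[which.max(vote_counts)]

print(vote_counts)
print(final_outcome)
```

With an even number of voters a tie is possible; `which.max` then returns the first of the tied answers, so a real implementation would need an explicit tie-breaking rule.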
Random Forest - Example
[Figure: decision trees for Friend 1, Friend 2 and Friend 3, with nodes such as Top Gun, Action movies, Tom Cruise, Godzilla, Far and Away and Oblivion, and leaves labelled Like / Don’t Like.]
Random Forest Use Cases
Banking
Identification of risky loan applicants by their probability of defaulting on payments.
Medicine
Identification of at-risk patients and disease trends.
Land Use (remote sensing)
Identification of areas of similar land use.
Marketing
Identification of customer churn.
How Random Forest Works?
Random Forest Algorithm
i. Randomly select m features from the T available features, where m ≪ T.
ii. For node d, calculate the best split point among the m features.
iii. Split the node into two daughter nodes using the best split.
iv. Repeat the first three steps until n nodes have been reached.
v. Build your forest by repeating steps i–iv D times.
• T: number of features
• D: number of trees to be constructed
• V: output, the class with the highest vote
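Steps i to iii of the algorithm can be sketched for a single node as follows. This is a minimal illustration in base R, not the implementation used by any package. The toy data, the use of Gini impurity as the split criterion, and the choice m = floor(sqrt(T)) are all assumptions, since the slide does not specify them.

```r
set.seed(42)  # fixed seed so the random feature subset is reproducible

# Toy data (hypothetical): four numeric features and a binary class label.
n <- 40
X <- data.frame(f1 = rnorm(n), f2 = rnorm(n), f3 = rnorm(n), f4 = rnorm(n))
y <- factor(ifelse(X$f2 > 0, "yes", "no"))

# Gini impurity of a set of class labels.
gini <- function(labels) {
  if (length(labels) == 0) return(0)
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}

# Weighted Gini impurity after splitting on `feature <= threshold`.
split_impurity <- function(feature, threshold, labels) {
  left  <- labels[feature <= threshold]
  right <- labels[feature > threshold]
  (length(left) * gini(left) + length(right) * gini(right)) / length(labels)
}

# Step i: randomly select m of the available features.
n_features <- ncol(X)
m <- floor(sqrt(n_features))
candidates <- sample(names(X), m)

# Step ii: among the m candidates, find the split with the lowest impurity.
best <- list(feature = NA, threshold = NA, impurity = Inf)
for (f in candidates) {
  for (thr in unique(X[[f]])) {
    imp <- split_impurity(X[[f]], thr, y)
    if (imp < best$impurity) {
      best <- list(feature = f, threshold = thr, impurity = imp)
    }
  }
}

# Step iii: split the node's rows into two daughter nodes.
left_rows  <- which(X[[best$feature]] <= best$threshold)
right_rows <- which(X[[best$feature]] > best$threshold)

print(candidates)
print(best)
```

Repeating this on each daughter node grows one tree; repeating the whole procedure D times, each time on a different sample of the data, gives the forest.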
How Random Forest Works?
Let’s take an example.
We have a dataset consisting of:
• Weather information for the last 14 days
• Whether a match was played on each of those days
Now, using the random forest, we need to predict whether the game will be played when the weather conditions are:
Outlook = Rain
Humidity = High
Wind = Weak
Play = ?
How Random Forest Works?
• The first step in Random Forest is to divide the data into smaller subsets.
• The subsets need not be distinct; some subsets may overlap.
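One way to obtain overlapping subsets is bootstrap sampling, that is, sampling rows with replacement. The sketch below assumes that approach, which is the usual random forest formulation; the slide itself shows smaller hand-picked subsets. The number of trees is a hypothetical value.

```r
set.seed(1)  # fixed seed for a reproducible illustration

days <- paste0("D", 1:14)   # the 14 days of weather records
n_trees <- 3                # hypothetical number of trees

# Draw one bootstrap sample (sampling with replacement) per tree.
subsets <- lapply(seq_len(n_trees), function(i) {
  sample(days, size = length(days), replace = TRUE)
})

# Days that appear in both of the first two subsets; such overlap is typical.
shared <- intersect(subsets[[1]], subsets[[2]])

print(subsets)
print(shared)
```

Because rows are drawn with replacement, a day can appear several times in one subset and not at all in another, which is what makes the trees differ from each other.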
How Random Forest Works?
[Figure: three decision trees built on the subsets D1,D2,D3; D7,D8,D9; and D3,D4,D5,D6, with nodes Overcast, Wind and Humidity and leaves labelled Play / No play.]
Features of Random Forest
Among the most accurate learning algorithms
Works well for both classification and regression problems
Runs efficiently on large databases
Requires almost no input preparation
Performs implicit feature selection
Trees can easily be grown in parallel
Offers methods for balancing error in unbalanced data sets
Demo
What if we could predict the
occurrence of diabetes and
take appropriate measures
beforehand to prevent it?
Sure! Let me take you
through the steps to
predict the vulnerable
patients.
Demo
Data Acquisition
Divide dataset
Implement model
Visualize
Model Validation
The doctor gets the following data from the patient’s medical history.
We will divide the entire dataset into two subsets:
• Training dataset -> to train the model
• Testing dataset -> to validate the model and make predictions
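A train/test split of this kind can be sketched in base R as follows. The data frame is a made-up stand-in for the patient data: only `is_diabetic`, `glucose_conc` and `diabet_test` are names that appear in the slides, while `bmi`, `diabet_train` and the 70/30 ratio are assumptions.

```r
set.seed(123)  # reproducible split

# Hypothetical stand-in for the patient data.
diabet <- data.frame(
  glucose_conc = round(rnorm(100, mean = 120, sd = 30)),
  bmi          = round(rnorm(100, mean = 30, sd = 6), 1),
  is_diabetic  = factor(sample(c("NO", "YES"), 100, replace = TRUE))
)

# Assumed 70/30 split: sample 70% of the row indices for training.
train_idx    <- sample(nrow(diabet), size = floor(0.7 * nrow(diabet)))
diabet_train <- diabet[train_idx, ]
diabet_test  <- diabet[-train_idx, ]

print(c(train = nrow(diabet_train), test = nrow(diabet_test)))
```

Sampling the indices at random, rather than taking the first 70 rows, avoids a biased split when the file happens to be ordered.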
Before we create the random forest, let’s find the best mtry value (the number of features tried at each split) using the following commands.
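The commands themselves are not reproduced in the slide text. One common way to search for a good mtry is `tuneRF` from the randomForest package; the sketch below assumes that package is installed and uses synthetic data, so the column names other than `glucose_conc` and `is_diabetic`, and all tuning values, are illustrative.

```r
library(randomForest)  # assumes the randomForest package is installed

set.seed(123)
# Hypothetical stand-in for the training data.
n <- 200
diabet_train <- data.frame(
  glucose_conc = rnorm(n, 120, 30),
  bmi          = rnorm(n, 30, 6),
  age          = round(runif(n, 21, 70)),
  bp           = rnorm(n, 70, 10)
)
diabet_train$is_diabetic <- factor(
  ifelse(diabet_train$glucose_conc + rnorm(n, 0, 20) > 130, "YES", "NO")
)

predictors <- setdiff(names(diabet_train), "is_diabetic")

# tuneRF tries several mtry values and records the out-of-bag (OOB) error.
tuned <- tuneRF(
  x = diabet_train[, predictors],
  y = diabet_train$is_diabetic,
  ntreeTry = 100, stepFactor = 2, improve = 0.01,
  trace = FALSE, plot = FALSE
)

# First column: mtry values tried; second column: their OOB error.
best_mtry <- tuned[which.min(tuned[, 2]), 1]

print(tuned)
print(best_mtry)
```

The mtry with the lowest OOB error is then passed to `randomForest` when the final model is fitted.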
Here, we implement the random forest in R using the following commands.
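The fitting step might look like the sketch below, assuming the randomForest package. The object name `diabet_forest` matches the one used later in the deck; the synthetic data, `ntree` and `mtry` values are illustrative assumptions, not the author's settings.

```r
library(randomForest)  # assumes the randomForest package is installed

set.seed(123)
# Hypothetical stand-in for the training data.
n <- 200
diabet_train <- data.frame(
  glucose_conc = rnorm(n, 120, 30),
  bmi          = rnorm(n, 30, 6),
  age          = round(runif(n, 21, 70)),
  bp           = rnorm(n, 70, 10)
)
diabet_train$is_diabetic <- factor(
  ifelse(diabet_train$glucose_conc + rnorm(n, 0, 20) > 130, "YES", "NO")
)

# Fit a classification forest: is_diabetic is predicted from all other columns.
diabet_forest <- randomForest(
  is_diabetic ~ ., data = diabet_train,
  ntree = 100, mtry = 2
)

# Printing the model shows the OOB error estimate and a confusion matrix.
print(diabet_forest)
```

Because the response is a factor, `randomForest` builds a classification forest; a numeric response would produce a regression forest instead.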
We get the following output.
Let’s see which variables are most important for our model. We can plot them using the following commands.
According to the MeanDecreaseGini value, glucose_conc is the most important variable in the model.
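Variable importance can be inspected with `importance` and `varImpPlot` from the randomForest package. The sketch below assumes that package and fits a forest on synthetic data, so the ranking it prints says nothing about the real patient data.

```r
library(randomForest)  # assumes the randomForest package is installed

set.seed(123)
# Hypothetical stand-in for the training data.
n <- 200
diabet_train <- data.frame(
  glucose_conc = rnorm(n, 120, 30),
  bmi          = rnorm(n, 30, 6),
  age          = round(runif(n, 21, 70)),
  bp           = rnorm(n, 70, 10)
)
diabet_train$is_diabetic <- factor(
  ifelse(diabet_train$glucose_conc + rnorm(n, 0, 20) > 130, "YES", "NO")
)

diabet_forest <- randomForest(is_diabetic ~ ., data = diabet_train, ntree = 100)

# One row per predictor; the MeanDecreaseGini column measures how much
# splits on that predictor reduce Gini impurity, summed over the forest.
imp <- importance(diabet_forest)
print(imp[order(imp[, "MeanDecreaseGini"], decreasing = TRUE), , drop = FALSE])

# Dot chart of the same values.
varImpPlot(diabet_forest)
```

A larger MeanDecreaseGini means the variable contributed more to separating the classes.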
Now we can use our model to predict the output for our testing dataset, using the following code.
# Predict the class label for every row of the testing dataset
pred1_diabet <- predict(diabet_forest, newdata = diabet_test, type = "class")
# Print the predicted labels
pred1_diabet
We get the following output for our testing dataset, where:
“YES” means the patient is predicted to be vulnerable to diabetes.
“NO” means the patient is predicted not to be vulnerable to diabetes.
We can create a confusion matrix for the model using the caret library to see how good our model is.
library(caret)
# Cross-tabulate predicted against actual labels and summarize the result
confusionMatrix(table(pred1_diabet, diabet_test$is_diabetic))
Accuracy = 79.66%
The accuracy (or overall success rate) is the proportion of records that the model has classified correctly. A good model should have a high accuracy score.
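The accuracy figure can be reproduced by hand from a confusion matrix. The sketch below uses base R only and six made-up predictions, so its result is unrelated to the 79.66% reported for the real model; `actual` is a hypothetical vector standing in for `diabet_test$is_diabetic`.

```r
# Hypothetical predictions and true labels for six test patients.
pred1_diabet <- factor(c("YES", "NO", "NO", "YES", "NO", "YES"),
                       levels = c("NO", "YES"))
actual       <- factor(c("YES", "NO", "YES", "YES", "NO", "NO"),
                       levels = c("NO", "YES"))

# Confusion matrix: rows are predictions, columns are true labels.
cm <- table(Predicted = pred1_diabet, Actual = actual)
print(cm)

# Accuracy = correctly classified records / all records.
# The correct ones sit on the diagonal of the matrix.
accuracy <- sum(diag(cm)) / sum(cm)
print(accuracy)
```

Here four of the six records lie on the diagonal, so the accuracy is 4/6, roughly 0.67.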
Course Details
Go to www.edureka.co/data-science
Get Edureka Certified in Data Science Today!
What our learners have to say about us!
Shravan Reddy says- “I would like to recommend any one who
wants to be a Data Scientist just one place: Edureka. Explanations
are clean, clear, easy to understand. Their support team works
very well.. I took the Data Science course and I'm going to take
Machine Learning with Mahout and then Big Data and Hadoop”.
Gnana Sekhar says - “Edureka Data science course provided me a very
good mixture of theoretical and practical training. LMS pre recorded
sessions and assignments were very good as there is a lot of
information in them that will help me in my job. Edureka is my
teaching GURU now...Thanks EDUREKA.”
Balu Samaga says - “It was a great experience to undergo and get
certified in the Data Science course from Edureka. Quality of the
training materials, assignments, project, support and other
infrastructures are a top notch.”
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...ThinkInnovation
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAbdelrhman abooda
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxFurkanTasci3
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Bookvip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Bookmanojkuma9823
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 

Recently uploaded (20)

Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptx
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Bookvip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 

Random Forest Tutorial | Random Forest in R | Machine Learning | Data Science Training | Edureka

  • 1. Random Forest (Edureka's Data Science Certification Training, www.edureka.co/data-science)
  • 2. What Will You Learn Today? (1) Introduction (2) Why Random Forest? (3) What is Random Forest? (4) Random Forest - Example (5) How Random Forest Works? (6) Demo in R: Diabetes Prevention Use Case
  • 3. Introduction
  • 4. Introduction To Classification. Classification is the problem of identifying which of a set of categories a new observation belongs to. It is a supervised learning task: the classifier is given a set of already classified examples and learns from them how to assign unseen examples to a category. Example: assigning a given email to the "spam" or "non-spam" category. Is this A or B?
  • 5. Types Of Classifiers. Decision Tree: builds classification models in the form of a tree structure; it breaks a dataset down into smaller and smaller subsets. Random Forest: an ensemble classifier built from many decision tree models; ensemble models combine the results from different models. Naïve Bayes: a classification technique based on Bayes' Theorem with an assumption of independence among attributes.
  • 6. Why Random Forest?
  • 7. Use Case - Credit Risk Detection. To minimize loss, the bank needs a decision rule to predict whose loan applications to approve. An applicant's demographic (income, debts, credit history) and socio-economic profiles are considered. Data science can help banks recognize behavior patterns and provide a complete view of individual customers.
  • 8. Use Case - Credit Risk Detection [diagram: decision trees over the attributes student, credit history, bank balance and age, each ending in Risk or No Risk and combined into a final outcome]
  • 9. What is Random Forest?
  • 10. What Is Random Forest? Random Forest is a versatile algorithm capable of performing both (i) regression and (ii) classification. It is a type of ensemble learning method. It is a commonly used predictive modelling and machine learning technique.
  • 11. Random Forest - Example. Let's say you want to decide whether to watch "Edge of Tomorrow". You have two options: (i) ask your best friend, or (ii) ask a bunch of friends.
  • 12. Random Forest - Example. To figure out whether you will like "Edge of Tomorrow", your best friend will consider a few things: (i) whether you like adventure and action, and (ii) whether you like Emily Blunt. Your best friend thus creates a decision tree. [diagram: decision tree with the nodes "Genre - Adventure", "Cast - Emily Blunt" and "Is Emily Blunt main lead?", with Yes/No branches ending in Like or Don't Like]
  • 13. Random Forest - Example. To get more accurate recommendations, you will have to ask a bunch of friends, say #Friend1, #Friend2 and #Friend3, and consider their votes. Each of them may consider movies of a different genre before deciding. The majority of the votes decides the final outcome. Thus you build a random forest from a group of friends.
  • 14. Random Forest - Example [diagram: three decision trees, one per friend (Friend 1, Friend 2, Friend 3), with nodes such as "Top Gun", "Action movies", "Godzilla", "Tom Cruise", "Far and Away" and "Oblivion", and Yes/No branches ending in Like or Don't Like]
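
The majority vote from the friends example can be sketched in a few lines of base R. This is only an illustration: the three votes and the helper name majority_vote are made up here and do not come from the slides.

```r
# Hypothetical votes: each friend plays the role of one decision tree.
friend_votes <- c(Friend1 = "Like", Friend2 = "Don't Like", Friend3 = "Like")

# Count the votes per answer and return the most frequent one.
# On a tie, which.max() returns the first answer in alphabetical order.
majority_vote <- function(votes) {
  counts <- table(votes)
  names(counts)[which.max(counts)]
}

final_outcome <- majority_vote(friend_votes)
print(final_outcome)
```

A real random forest aggregates its trees the same way for classification: every tree casts one vote and the most common class wins.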
  • 15. Random Forest Use Cases. Banking: identification of loan applicants at risk, by their probability of defaulting on payments. Medicine: identification of at-risk patients and disease trends. Land use (remote sensing): identification of areas of similar land use. Marketing: identifying customer churn.
  • 16. How Random Forest Works?
  • 17. Random Forest Algorithm. (i) Randomly select m features from T, where m ≪ T. (ii) For node d, calculate the best split point among the m features. (iii) Split the node into two daughter nodes using the best split. (iv) Repeat the first three steps until n nodes have been reached. (v) Build your forest by repeating steps i–iv D times. Here T is the number of features, D is the number of trees to be constructed, and V is the output: the class with the highest vote.
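
Steps i and ii for a single node might look like the base R sketch below. It assumes Gini impurity as the split criterion (the slide does not name one), and the function names gini and best_split as well as the toy data are invented for illustration.

```r
# Gini impurity of a vector of class labels.
gini <- function(y) {
  if (length(y) == 0) return(0)
  p <- table(y) / length(y)
  1 - sum(p^2)
}

# Steps i-ii for one node: draw m of the T features at random, then search
# those m features for the threshold with the lowest weighted Gini impurity.
best_split <- function(X, y, m) {
  candidates <- sample(ncol(X), m)            # step i: m random feature indices
  best <- list(feature = NA, threshold = NA, impurity = Inf)
  for (j in candidates) {
    for (thr in unique(X[, j])) {
      left  <- y[X[, j] <= thr]
      right <- y[X[, j] >  thr]
      if (length(left) == 0 || length(right) == 0) next
      w <- (length(left) * gini(left) + length(right) * gini(right)) / length(y)
      if (w < best$impurity) {
        best <- list(feature = colnames(X)[j], threshold = thr, impurity = w)
      }
    }
  }
  best
}

# Toy data with T = 3 features; only x1 separates the classes perfectly.
set.seed(1)
X <- data.frame(x1 = c(1, 2, 3, 4, 5, 6),
                x2 = c(6, 1, 5, 2, 4, 3),
                x3 = c(0, 1, 0, 1, 0, 1))
y <- c("No", "No", "No", "Yes", "Yes", "Yes")

node_split <- best_split(X, y, m = 2)   # considers 2 of the 3 features
print(node_split)
```

Step iii would partition the rows at the returned threshold, and steps iv and v repeat the procedure for further nodes and further trees. Because only m features are considered at each node, different trees tend to split on different variables.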
  • 18. How Random Forest Works? Let's take an example. We have a dataset consisting of weather information for the last 14 days and whether the match was played on each of those days. Using random forest, we need to predict whether the game will be played when the weather conditions are Outlook = Rain, Humidity = High, Wind = Weak, Play = ?
  • 19. How Random Forest Works? The first step in random forest is to divide the data into smaller subsets. The subsets need not be distinct; some subsets may overlap.
  • 20. How Random Forest Works? [diagram: three decision trees built on the subsets D1, D2, D3; D3, D4, D5, D6; and D7, D8, D9, splitting on Overcast, Wind and Humidity, with leaves Play or No play]
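
One way to get overlapping subsets is bootstrap sampling, that is, sampling rows with replacement, which is what standard random forest implementations typically do. The sketch below assumes that is what the slides mean; the day labels D1 to D14 follow the example, while the tree count and variable names are illustrative.

```r
set.seed(42)
days <- paste0("D", 1:14)   # the 14 days of weather records

# Draw one bootstrap sample per tree: same size as the data, with replacement,
# so a day can appear several times in one sample and in several samples.
n_trees <- 3
bootstrap_samples <- lapply(seq_len(n_trees), function(i) {
  sample(days, size = length(days), replace = TRUE)
})

for (i in seq_len(n_trees)) {
  cat("Tree", i, ":", bootstrap_samples[[i]], "\n")
}

# Days that the first two trees have in common.
shared <- intersect(bootstrap_samples[[1]], bootstrap_samples[[2]])
cat("Days used by both tree 1 and tree 2:", shared, "\n")
```

Each tree is then trained on its own sample, and the forest predicts by majority vote over the trees.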
  • 21. Features of Random Forest: among the most accurate learning algorithms; works well for both classification and regression problems; runs efficiently on large databases; requires almost no input preparation; performs implicit feature selection; can easily be grown in parallel; offers methods for balancing error in unbalanced data sets.
  • 23. What if we could predict the occurrence of diabetes and take appropriate measures beforehand to prevent it? Sure! Let me take you through the steps to predict the vulnerable patients.
  • 24. Demo (steps: Data Acquisition, Divide dataset, Implement model, Visualize, Model Validation). The doctor gets the following data from the patient's medical history.
  • 25. Demo. We will divide our entire dataset into two subsets: a training dataset, to train the model, and a testing dataset, to validate the model and make predictions.
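
A split like this could be done in base R as sketched below. The data frame is a synthetic stand-in: glucose_conc, is_diabetic and diabet_test are names that appear in the slides, while diabet, diabet_train, bmi and the 70/30 ratio are assumptions made for the example.

```r
set.seed(123)

# Synthetic stand-in for the patient data (values are random, not real).
diabet <- data.frame(
  glucose_conc = round(runif(100, 70, 200)),
  bmi          = round(runif(100, 18, 45), 1),
  is_diabetic  = factor(sample(c("NO", "YES"), 100, replace = TRUE))
)

# Pick 70% of the row indices at random for training; the rest is for testing.
train_idx    <- sample(nrow(diabet), size = floor(0.7 * nrow(diabet)))
diabet_train <- diabet[train_idx, ]
diabet_test  <- diabet[-train_idx, ]

cat("Training rows:", nrow(diabet_train), " Testing rows:", nrow(diabet_test), "\n")
```

Keeping the test rows out of training is what makes the accuracy measured later an honest estimate.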
  • 26. Demo. Before we create the random forest, let's find the best mtry value using the following commands.
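
The commands themselves did not survive in the transcript. One common way to search for mtry is tuneRF() from the randomForest package, sketched here on synthetic data; whether the slides used this function is an assumption, and every column except glucose_conc and is_diabetic is invented.

```r
library(randomForest)

set.seed(123)
n <- 300

# Synthetic training data; the outcome depends mainly on glucose_conc and bmi.
diabet_train <- data.frame(
  glucose_conc   = runif(n, 70, 200),
  bmi            = runif(n, 18, 45),
  age            = sample(21:80, n, replace = TRUE),
  blood_pressure = runif(n, 50, 110),
  insulin        = runif(n, 15, 300)
)
risk <- plogis((diabet_train$glucose_conc - 130) / 15 + (diabet_train$bmi - 30) / 10)
diabet_train$is_diabetic <- factor(ifelse(runif(n) < risk, "YES", "NO"))

predictors <- setdiff(names(diabet_train), "is_diabetic")

# tuneRF() tries neighbouring mtry values and reports the out-of-bag error
# for each: first column = mtry tried, second column = OOB error.
tuned <- tuneRF(x = diabet_train[, predictors], y = diabet_train$is_diabetic,
                ntreeTry = 200, stepFactor = 2, improve = 0.01,
                trace = FALSE, plot = FALSE)

best_mtry <- unname(tuned[which.min(tuned[, 2]), 1])
cat("mtry with the lowest OOB error:", best_mtry, "\n")
```

mtry is the number of features sampled at each split, the m of the algorithm slide.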
  • 27. Demo. Here, we implement random forest in R using the following commands.
  • 28. Demo. We get the output as follows.
  • 29. Demo. Let's see which variables are most important for our model. For plotting them, we can use the following commands. As per the MeanDecreaseGini value, glucose_conc is the most important variable in the model.
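
Since the commands are missing from the transcript, here is a hedged sketch of fitting the model and inspecting variable importance with the randomForest package. The data is synthetic and built so that glucose_conc drives the outcome; ntree = 300, mtry = 2 and the columns bmi and age are assumptions, not values from the slides.

```r
library(randomForest)

set.seed(123)
n <- 300

# Synthetic training data in which glucose_conc is the main driver.
diabet_train <- data.frame(
  glucose_conc = runif(n, 70, 200),
  bmi          = runif(n, 18, 45),
  age          = sample(21:80, n, replace = TRUE)
)
risk <- plogis((diabet_train$glucose_conc - 130) / 15 + (diabet_train$bmi - 30) / 10)
diabet_train$is_diabetic <- factor(ifelse(runif(n) < risk, "YES", "NO"))

# Fit the forest: is_diabetic is the response, all other columns are predictors.
diabet_forest <- randomForest(is_diabetic ~ ., data = diabet_train,
                              ntree = 300, mtry = 2, importance = TRUE)
print(diabet_forest)   # OOB error estimate and confusion matrix

# Importance table sorted by MeanDecreaseGini, then the matching plot.
imp <- importance(diabet_forest)
print(imp[order(imp[, "MeanDecreaseGini"], decreasing = TRUE),
          "MeanDecreaseGini", drop = FALSE])
varImpPlot(diabet_forest)
```

MeanDecreaseGini sums, over all trees, how much each variable's splits reduce node impurity, so larger values indicate more influential variables.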
  • 30. Demo. Now we can use our model to predict the output for our testing dataset. We can use the following code for predicting the output: pred1_diabet <- predict(diabet_forest, newdata = diabet_test, type = "class"); pred1_diabet
  • 31. Demo. We get the following output for our testing dataset, where "YES" means the patient is predicted to be vulnerable to diabetes and "NO" means the patient is predicted not to be vulnerable.
  • 32. Demo. We can create a confusion matrix for the model using the caret library to see how good our model is: library(caret); confusionMatrix(table(pred1_diabet, diabet_test$is_diabetic))
  • 33. Demo. Accuracy = 79.66%. The accuracy (or overall success rate) is the proportion of records that the model has classified correctly. A good model should have a high accuracy score.
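
To show where an accuracy figure comes from, the sketch below builds a confusion matrix with base R's table() and computes accuracy by hand, instead of using caret's confusionMatrix() as the slides do. The ten actual and predicted labels are made up for illustration and are unrelated to the 79.66% result.

```r
# Hypothetical true labels and model predictions for ten test patients.
actual    <- factor(c("YES", "NO", "NO",  "YES", "NO", "YES", "NO", "NO", "YES", "NO"),
                    levels = c("NO", "YES"))
predicted <- factor(c("YES", "NO", "YES", "YES", "NO", "NO",  "NO", "NO", "YES", "NO"),
                    levels = c("NO", "YES"))

# Rows are predictions, columns are true labels; the diagonal holds the hits.
conf_mat <- table(Predicted = predicted, Actual = actual)
print(conf_mat)

# Accuracy = correctly classified records / all records.
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
cat("Accuracy:", accuracy, "\n")
```

Here eight of the ten predictions match the true labels, so the accuracy is 0.8. The off-diagonal cells hold the two kinds of error: a false "YES" and a missed "YES".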
  • 34. Course Details. Go to www.edureka.co/data-science. Get Edureka Certified in Data Science Today! What our learners have to say about us! Shravan Reddy says - “I would like to recommend any one who wants to be a Data Scientist just one place: Edureka. Explanations are clean, clear, easy to understand. Their support team works very well.. I took the Data Science course and I'm going to take Machine Learning with Mahout and then Big Data and Hadoop”. Gnana Sekhar says - “Edureka Data science course provided me a very good mixture of theoretical and practical training. LMS pre recorded sessions and assignments were very good as there is a lot of information in them that will help me in my job. Edureka is my teaching GURU now...Thanks EDUREKA.” Balu Samaga says - “It was a great experience to undergo and get certified in the Data Science course from Edureka. Quality of the training materials, assignments, project, support and other infrastructures are a top notch.”
