DECISION TREE, SOFTMAX 
REGRESSION AND ENSEMBLE 
METHODS IN MACHINE LEARNING 
- Abhishek Vijayvargia
WHAT IS MACHINE LEARNING 
 Formal Approach 
 Field of study that gives computers the ability to learn without being explicitly programmed. 
 Informal Approach
MACHINE LEARNING 
 Supervised Learning 
 Supervised learning is the machine learning task of 
inferring a function from labeled training data. 
 Approximation: supervised learning can be viewed as function approximation. 
 Unsupervised Learning 
 Trying to find hidden structure in unlabeled data. 
 Examples given to the learner are unlabeled, there is no 
error or reward signal to evaluate a potential solution. 
 Shorter description: unsupervised learning can be viewed as finding a shorter description of the data. 
 Reinforcement learning 
 Learning by interacting with an environment
SUPERVISED LEARNING 
 Classification 
 Output variable takes class labels. 
 Ex. Predicting whether a mail is spam or ham. 
 Regression 
 Output variable is numeric or continuous. 
 Ex. Predicting temperature.
DECISION TREES 
 Is this restaurant good? (YES / NO)
DECISION TREES 
 What factors decide whether a restaurant is good for you or not? 
 Type: Italian, South Indian, French 
 Atmosphere: Casual, Fancy 
 How many people are inside it? (10 < people < 30) 
 Cost 
 Weather outside: Rainy, Sunny, Cloudy 
 Hungry: Yes/No
DECISION TREE 
[Example decision tree (diagram): the root tests Hungry (True/False). The True branch tests Rainy and then People > 10, ending in YES/NO leaves; the False branch tests Type (French / South Indian) and Cost (More / Less), ending in YES/NO leaves.]
DECISION TREE LEARNING 
 Pick the best attribute. 
 Make a decision tree node containing that attribute. 
 For each value of the attribute, create a descendant of the node. 
 Sort the training examples to the leaves. 
 Iterate on the subsets using the remaining attributes.
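Below is a minimal, hypothetical sketch of this greedy loop for categorical attributes (ID3-style); `info_gain` stands in for the gain function defined on the following slides, and all names are illustrative, not the speaker's code.

```python
from collections import Counter

def build_tree(examples, labels, attributes, info_gain):
    """examples: list of dicts attribute -> value; labels: class labels."""
    if len(set(labels)) == 1:                  # pure node: stop
        return labels[0]
    if not attributes:                         # no attributes left: majority vote
        return Counter(labels).most_common(1)[0][0]
    # Pick the best attribute by information gain.
    best = max(attributes, key=lambda a: info_gain(examples, labels, a))
    node = {"attribute": best, "children": {}}
    rest = [a for a in attributes if a != best]
    for value in {e[best] for e in examples}:  # one descendant per value
        idx = [i for i, e in enumerate(examples) if e[best] == value]
        node["children"][value] = build_tree(
            [examples[i] for i in idx], [labels[i] for i in idx], rest, info_gain
        )
    return node
```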
DECISION TREE : PICK BEST ATTRIBUTE 
[Diagram: three candidate splits (Graph 1, Graph 2, Graph 3), each sending examples down True/False branches. Graphs 1 and 2 leave a mix of + and - examples on both branches; Graph 3 separates them cleanly (all + on the True branch, all - on the False branch), so it is the best attribute.]
DECISION TREE : PICK BEST ATTRIBUTE 
 Select the attribute which gives MAXIMUM Information 
Gain. 
 Gain measures how well a given attribute separates 
training examples into targeted classes. 
 Entropy is a measure of the amount of uncertainty in the 
(data) set. 
H(S) = -\sum_{x \in X} p(x) \log_2 p(x) 
S: the current data set for which entropy is calculated. 
X: the set of classes in S. 
p(x): the proportion of elements in class x to the total number of elements in S.
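As a quick illustration (a sketch, not from the slides), entropy in Python:

```python
import math
from collections import Counter

def entropy(labels):
    """H(S) = -sum over classes of p(x) * log2(p(x))."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

print(entropy(["+", "+", "-", "-"]))  # 1.0: maximally uncertain 50/50 set
print(entropy(["+", "+", "+", "-"]))  # ~0.811: mostly one class
```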
DECISION TREE : INFORMATION GAIN 
 Information gain IG(A, S) measures the difference in entropy before and after the set S is split on an attribute A. 
 In other words, how much uncertainty in S was 
reduced after splitting set S on attribute A. 
IG(A, S) = H(S) - \sum_{t \in T} p(t) H(t) 
H(S): entropy of set S. 
T: the subsets created from splitting set S on attribute A, such that S = \bigcup_{t \in T} t. 
p(t): the proportion of elements in t to the number of elements in S.
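A worked check of the formula on a made-up split (illustrative numbers, not from the slides): let S contain 6 positive and 2 negative examples, and let attribute A split S into t_1 = \{4+, 0-\} and t_2 = \{2+, 2-\}. Then

H(S) = -\tfrac{6}{8}\log_2\tfrac{6}{8} - \tfrac{2}{8}\log_2\tfrac{2}{8} \approx 0.811 
H(t_1) = 0, \quad H(t_2) = 1 
IG(A, S) = 0.811 - \tfrac{4}{8} \cdot 0 - \tfrac{4}{8} \cdot 1 \approx 0.311

so splitting on A removes about 0.311 bits of uncertainty.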
DECISION TREE ALGORITHM : BIAS 
 Restriction Bias: the hypothesis space is all possible decision trees. 
 Preference Bias: which trees does the algorithm prefer? 
 Good splits at the top. 
 Correct over incorrect. 
 Shorter trees.
DECISION TREE : CONTINUOUS ATTRIBUTE 
 Branch on every possible value? 
 Include only the ages seen in the training set? 
 Useless when we encounter an age not present in the training set. 
 Instead, represent the attribute as a range, e.g. a node that tests 20 <= Age < 30.
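One common way to realize such range tests (a hypothetical sketch; `gain_of_split` is any partition-scoring function, e.g. information gain):

```python
def best_threshold(values, gain_of_split):
    """Scan candidate thresholds for a continuous attribute and keep the
    one whose binary split (value < t) scores best."""
    candidates = sorted(set(values))
    best_t, best_gain = None, float("-inf")
    for lo, hi in zip(candidates, candidates[1:]):
        t = (lo + hi) / 2                    # midpoint between adjacent values
        mask = [v < t for v in values]       # True branch: value < t
        gain = gain_of_split(mask)
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain
```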
DECISION TREE : CONTINUOUS ATTRIBUTE 
 Does it make sense to repeat an attribute along a 
path in the tree? 
[Diagram: a small tree in which attribute A appears more than once along a path, interleaved with B. For a discrete attribute this is pointless (its value is already known further down the path), but re-testing a continuous attribute against a different threshold can make sense.]
DECISION TREE : WHEN DO WE STOP? 
 Everything classified correctly? (Noisy data may give two different labels for the same example.) 
 No more attributes? (Not a good criterion for continuous attributes, which allow infinitely many splits.) 
 Pruning.
SOFTMAX REGRESSION 
 Softmax regression (or multinomial logistic regression) is a classification method that generalizes logistic regression to multiclass problems, i.e. those with more than two possible discrete outcomes. 
 Used to predict the probabilities of the different 
possible outcomes of a categorically distributed 
dependent variable, given a set of independent 
variables (which may be real-valued, binary-valued, 
categorical-valued, etc.).
LOGISTIC REGRESSION 
 Logistic regression refers specifically to the problem in which the dependent variable is binary (only two categories). 
 Since the output variable y \in \{0, 1\}, it is natural to choose the Bernoulli family of distributions to model the conditional distribution of y given x. 
 The logistic function (which always takes values between zero and one):

F(t) = \frac{1}{1 + e^{-t}} = \frac{1}{1 + e^{-\theta^T x}} \quad \text{with } t = \theta^T x
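A tiny sketch of this function in Python (illustrative only):

```python
import math

def sigmoid(t):
    """Logistic function: maps any real t into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))

print(sigmoid(0.0))   # 0.5: maximal uncertainty
print(sigmoid(5.0))   # ~0.993: squashed toward 1
print(sigmoid(-5.0))  # ~0.007: squashed toward 0
```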
SOFTMAX REGRESSION 
 Used in classification problems in which the response variable y can take on any one of k values: 
 y \in \{1, 2, \ldots, k\} 
 Ex. Classify emails into three classes { Primary, 
Social, Promotions } 
 Response variable is still discrete but can take 
more than two values. 
 To derive a Generalized Linear Model for multinomial data, we begin by expressing the multinomial as an exponential-family distribution.
SOFTMAX REGRESSION 
 To parameterize a multinomial over k possible outcomes, we could use k parameters \phi_1, \ldots, \phi_k specifying the probability of each outcome. 
 These parameters are redundant, because \sum_{i=1}^{k} \phi_i = 1. So \phi_i = p(y = i; \phi), and p(y = k; \phi) = 1 - \sum_{i=1}^{k-1} \phi_i. 
 The indicator function 1\{\cdot\} takes the value 1 if its argument is true, and 0 otherwise: 
 1\{\text{True}\} = 1, \quad 1\{\text{False}\} = 0.
SOFTMAX REGRESSION 
 The multinomial is a member of the exponential family:

p(y; \phi) = \phi_1^{1\{y=1\}} \phi_2^{1\{y=2\}} \cdots \phi_k^{1\{y=k\}} 
= \phi_1^{1\{y=1\}} \phi_2^{1\{y=2\}} \cdots \phi_k^{1 - \sum_{i=1}^{k-1} 1\{y=i\}} 
= b(y) \exp\left( \omega^T T(y) - a(\omega) \right)

where

\omega = \begin{bmatrix} \log(\phi_1/\phi_k) \\ \log(\phi_2/\phi_k) \\ \vdots \\ \log(\phi_{k-1}/\phi_k) \end{bmatrix}, \quad a(\omega) = -\log \phi_k, \quad b(y) = 1, \quad T(y) \in \mathbb{R}^{k-1}
SOFTMAX REGRESSION 
 The link function is given as

\omega_i = \log \frac{\phi_i}{\phi_k}

 To invert the link function and derive the response function:

e^{\omega_i} = \frac{\phi_i}{\phi_k} 
\phi_k e^{\omega_i} = \phi_i 
\phi_k \sum_{i=1}^{k} e^{\omega_i} = \sum_{i=1}^{k} \phi_i = 1
SOFTMAX REGRESSION 
 So we get \phi_k = 1 / \sum_{i=1}^{k} e^{\omega_i}; substituting back gives the response function

\phi_i = \frac{e^{\omega_i}}{\sum_{j=1}^{k} e^{\omega_j}}

 The conditional distribution of y given x is then

p(y = i \mid x; \theta) = \phi_i = \frac{e^{\omega_i}}{\sum_{j=1}^{k} e^{\omega_j}} = \frac{e^{\theta_i^T x}}{\sum_{j=1}^{k} e^{\theta_j^T x}}

(with \omega_i = \theta_i^T x)
SOFTMAX REGRESSION 
 Softmax regression is a generalization of logistic 
regression. 
 Our hypothesis will output

h_\theta(x) = \begin{bmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_k \end{bmatrix}

 In other words, our hypothesis will output the estimated probability p(y = i \mid x; \theta) for every value of i = 1, \ldots, k.
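A minimal NumPy sketch of this hypothesis (illustrative; the parameter layout, with one row of `theta` per class, is an assumption):

```python
import numpy as np

def softmax_hypothesis(theta, x):
    """Return [phi_1, ..., phi_k]: estimated p(y = i | x; theta) per class."""
    scores = theta @ x                    # omega_i = theta_i^T x, shape (k,)
    scores = scores - scores.max()        # subtract max for numerical stability
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum()  # phi_i = e^{omega_i} / sum_j e^{omega_j}

theta = np.array([[0.5, -1.0],            # k = 3 classes, 2 features
                  [0.0,  0.2],
                  [-0.3,  0.8]])
x = np.array([1.0, 2.0])
print(softmax_hypothesis(theta, x))       # three probabilities summing to 1
```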
ENSEMBLE LEARNING 
 Ensemble learning uses multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. 
 Ensemble learning is primarily used to improve the 
prediction performance of a model, or reduce the 
likelihood of an unfortunate selection of a poor one.
HOW GOOD ARE ENSEMBLES? 
 Let’s look at the Netflix Prize competition…
NETFLIX PRIZE : STARTED IN OCT 2006 
 Supervised Learning Task 
 Training data is a set of users and the ratings (1, 2, 3, 4, or 5 stars) those users have given to movies. 
 Construct a classifier that, given a user and an unrated movie, correctly classifies that movie as either 1, 2, 3, 4, or 5 stars. 
 $1 million prize for a 10% improvement over Netflix's current movie recommender/classifier.
NETFLIX PRIZE : LEADER BOARD
ENSEMBLE LEARNING : GENERAL IDEA
ENSEMBLE LEARNING : BAGGING 
 Given: 
 A training set S of N examples. 
 A class of learning models (decision tree, NB, SVM, RF, etc.). 
 Training: 
 At each iteration i, a training set S_i of N tuples is sampled with replacement from S. 
 A classifier model M_i is learned for each training set S_i. 
 Classification: classify an unknown sample x. 
 Each classifier M_i returns its class prediction. 
 The bagged classifier M* counts the votes and assigns the class with the most votes.
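A hypothetical sketch of this procedure with scikit-learn-style estimators (names and the choice of base learner are illustrative):

```python
import numpy as np
from collections import Counter
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, base_estimator, n_models=25, seed=0):
    """Train n_models classifiers, each on a bootstrap sample S_i of size N."""
    rng = np.random.default_rng(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)       # sample N tuples with replacement
        models.append(clone(base_estimator).fit(X[idx], y[idx]))
    return models

def bagging_predict(models, x):
    """M*: each M_i votes; return the class with the most votes."""
    votes = [m.predict([x])[0] for m in models]
    return Counter(votes).most_common(1)[0][0]

# usage sketch: models = bagging_fit(X_train, y_train, DecisionTreeClassifier())
#               y_hat  = bagging_predict(models, X_test[0])
```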
ENSEMBLE LEARNING : BAGGING 
 Bagging reduces variance by voting/averaging. 
 Can help a lot when the data is noisy. 
 If the learning algorithm is unstable, bagging almost always improves performance.
ENSEMBLE LEARNING : RANDOM FORESTS 
 A Random Forest grows many classification trees. 
 To classify a new object from an input vector, put 
the input vector down each of the trees in the 
forest. 
 Each tree gives a classification, and we say the tree 
"votes" for that class. 
 The forest chooses the classification having the 
most votes (over all the trees in the forest).
ENSEMBLE LEARNING : RANDOM FORESTS 
 Each tree is grown as follows: 
 If the number of cases in the training set is N, 
sample N cases at random - but with replacement, 
from the original data. This sample will be the 
training set for growing the tree. 
 If there are M input variables, a number m<<M is 
specified such that at each node, m variables are 
selected at random out of the M and the best split 
on these m is used to split the node. The value of m 
is held constant during the forest growing. 
 Each tree is grown to the largest extent possible. 
There is no pruning.
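A minimal sketch with scikit-learn's RandomForestClassifier (an assumed library choice; the talk doesn't name one). `max_features` plays the role of m << M, and trees are grown without pruning by default:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,     # number of trees, each grown on a bootstrap sample
    max_features="sqrt",  # m variables tried at each node, m << M
    random_state=0,
).fit(X_train, y_train)

print(forest.score(X_test, y_test))   # majority vote over all trees
print(forest.feature_importances_)    # estimates of variable importance
```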
FEATURES OF RANDOM FORESTS 
 Among the most accurate of current algorithms. 
 Runs efficiently on large databases. 
 It can handle thousands of input variables without 
variable deletion. 
 It gives estimates of what variables are important in 
the classification. 
 Effective method for estimating missing data and 
maintains accuracy when a large proportion of the 
data are missing. 
 Generated forests can be saved for future use on 
other data.
ENSEMBLE LEARNING : BOOSTING 
 Create a sequence of classifiers, giving higher influence to more accurate classifiers. 
 At each iteration, make the currently misclassified examples more important (they get a larger weight in the construction of the next classifier). 
 Then combine the classifiers by a weighted vote (weights given by classifier accuracy).
ENSEMBLE LEARNING : BOOSTING 
 Suppose there are just 7 training examples {1,2,3,4,5,6,7}. 
 Initially each example has a 1/7 (≈ 0.143) probability of being sampled. 
 The 1st round of boosting samples (with replacement) 7 examples {3,5,5,4,6,7,3} and builds a classifier from them. 
 Suppose examples {2,3,4,6,7} are correctly predicted by this classifier and examples {1,5} are wrongly predicted: 
 Weights of examples {1,5} are increased. 
 Weights of examples {2,3,4,6,7} are decreased. 
 The 2nd round of boosting again samples 7 examples, but now examples {1,5} are more likely to be sampled. 
 And so on, until some convergence is achieved.
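A hypothetical sketch of this reweighting loop in the style of AdaBoost (the talk doesn't name a specific algorithm; labels are assumed to be ±1 and `train_weak` is any weighted weak learner):

```python
import numpy as np

def boost(X, y, train_weak, n_rounds=10):
    """y in {-1, +1}; train_weak(X, y, w) returns h with h(X) -> predictions."""
    n = len(y)
    w = np.full(n, 1.0 / n)                      # initially each example: 1/n
    models, alphas = [], []
    for _ in range(n_rounds):
        h = train_weak(X, y, w)                  # e.g. a decision stump
        pred = h(X)
        err = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)  # weighted error
        if err >= 0.5:                           # weak learner must beat chance
            break
        alpha = 0.5 * np.log((1 - err) / err)    # accurate rounds get more say
        w *= np.exp(-alpha * y * pred)           # raise weights of mistakes
        w /= w.sum()                             # renormalize to a distribution
        models.append(h)
        alphas.append(alpha)
    # Final strong learner: weighted vote of the weak learners.
    return lambda X_: np.sign(sum(a * h(X_) for a, h in zip(alphas, models)))
```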
ENSEMBLE LEARNING : BOOSTING 
 Weights models according to their performance. 
 Encourages each new model to become an “expert” for instances misclassified by earlier models. 
 Combines “weak learners” to generate a “strong learner”.
ENSEMBLE LEARNING 
 The Netflix Prize winner used gradient boosted decision trees. 
 http://www.netflixprize.com/assets/GrandPrize2009_BPC_BellKor.pdf
THANK YOU FOR YOUR ATTENTION
 Ask questions to narrow down possibilities 
 Informatica building example 
 Mango machine learning 
 Cannot look at all trees
