Decision trees are a type of supervised learning algorithm used for classification and regression. ID3 and C4.5 are algorithms that generate decision trees by choosing the attribute with the highest information gain at each step. Random forest is an ensemble method that creates multiple decision trees and aggregates their results, improving accuracy. It introduces randomness when building trees to decrease variance.
A decision tree is a type of supervised learning algorithm (with a pre-defined target variable) that is mostly used for classification problems. It is a tree in which each branch node represents a choice between a number of alternatives, and each leaf node represents a decision.
This presentation was prepared as part of the curriculum studies for CSCI-659 Topics in Artificial Intelligence Course - Machine Learning in Computational Linguistics.
It was prepared under guidance of Prof. Sandra Kubler.
Basics of Decision Tree Learning. This deck includes the definition of a decision tree, a basic example, the basic construction of a decision tree, and a MATLAB example.
3. Decision Trees
• A decision tree is a simple but powerful learning paradigm.
• It is a classification algorithm for supervised learning.
• A decision tree is a tree in which each branch node represents
a choice between a number of alternatives and each leaf node
represents a decision.
• A node with outgoing edges is called an internal or test node.
All other nodes are called leaves (also known as terminal or
decision nodes).
5. Construction of decision tree
1) First test all attributes and select the one that will function
as the best root.
2) Break-up the training set into subsets based on the
branches of the root node.
3) Test the remaining attributes to see which ones fit best
underneath the branches of the root node.
4) Continue this process for all other branches until:
6. Decision Trees
I. All the examples of a subset are of one type.
II. There are no examples left.
III. There are no more attributes left.
7. Decision Tree Example
• Given a collection of shape-and-colour examples, learn a
decision tree that represents them. Use this representation to
classify new examples.
8. Decision Trees: The Representation
• Decision trees are classifiers for instances represented as
feature vectors. (color = ; shape = ; label = )
• Nodes are tests for feature values
• There is one branch for each value of the feature
• Leaves specify the categories (labels)
• Can categorize instances into multiple disjoint categories – multi-class
[Figure: a tree rooted at Color with branches blue, red, and green; Shape
tests beneath the branches (square, triangle, circle); leaves labelled with
categories A, B, and C. The slide illustrates both evaluation of the tree on
(color = RED; shape = triangle) and learning of the tree.]
9. Binary Classified Decision Trees
• Decision trees can represent any Boolean function.
• A tree can be rewritten as rules in Disjunctive Normal Form (DNF):
- green ∧ square → positive
- blue ∧ circle → positive
- blue ∧ square → positive
• The disjunction of these rules is equivalent to the decision tree.
[Figure: the same Color/Shape tree with leaves labelled + and −.]
10. Limitation of Decision Trees
• Learning a tree that classifies the training data perfectly may
not lead to the tree with the best generalization performance.
- There may be noise in the training data that the tree is fitting.
- The algorithm might be making decisions based on
very little data.
• A hypothesis h is said to overfit the training data if there is
another hypothesis h’ such that h has smaller error than h’
on the training data but larger error than h’ on the test data.
12. Evaluation Methods for Decision Trees
• Two basic approaches
- Pre-pruning: Stop growing the tree at some point during
construction when it is determined that there is not enough
data to make reliable choices.
- Post-pruning: Grow the full tree and then remove nodes
that seem not to have sufficient evidence.
• Methods for evaluating subtrees to prune:
- Cross-validation: Reserve a hold-out set to evaluate utility.
- Statistical testing: Test whether the observed regularity can be
dismissed as likely to occur by chance.
- Minimum Description Length: Is the additional complexity of
the hypothesis smaller than remembering the exceptions ?
13. ID3 (Iterative Dichotomiser 3)
• Invented by J. Ross Quinlan in 1975.
• Used to generate a decision tree from a given data set by
employing a top-down, greedy search to test each attribute at
every node of the tree.
• The resulting tree is used to classify future samples.
16. Training Set (attributes: Gender, Car Ownership, Travel Cost, Income Level; class: Transportation)
Gender | Car Ownership | Travel Cost | Income Level | Transportation
Male   | 0 | Cheap     | Low    | Bus
Male   | 1 | Cheap     | Medium | Bus
Female | 1 | Cheap     | Medium | Train
Female | 0 | Cheap     | Low    | Bus
Male   | 1 | Cheap     | Medium | Bus
Male   | 0 | Standard  | Medium | Train
Female | 1 | Standard  | Medium | Train
Female | 1 | Expensive | High   | Car
Male   | 2 | Expensive | Medium | Car
Female | 2 | Expensive | High   | Car
17. Algorithm
• Calculate the Entropy of every attribute using the data set.
• Split the set into subsets using the attribute for which entropy is
minimum (or, equivalently, information gain is maximum).
• Make a decision tree node containing that attribute.
• Recurse on subsets using remaining attributes.
18. Entropy
• In order to define Information Gain precisely, we need to discuss
Entropy first.
• A formula to calculate the homogeneity of a sample.
• A completely homogeneous sample has entropy of 0 (leaf node).
• An equally divided two-class sample has entropy of 1.
• The formula for entropy is:
Entropy(S) = − ∑ p(I) log2 p(I)
• where p(I) is the proportion of S belonging to class I, and the
sum runs over all classes (outcomes).
19. Entropy
• Example 1
If S is a collection of 14 examples with 9 YES and 5 NO
examples then
Entropy(S) = - (9/14) Log2 (9/14) - (5/14) Log2 (5/14)
= 0.940
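As a quick check, the slide's arithmetic can be reproduced in a few lines of Python (a sketch; the function name `entropy` is mine):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())

# 9 YES and 5 NO examples, as on the slide
labels = ["yes"] * 9 + ["no"] * 5
print(f"{entropy(labels):.3f}")  # 0.940
```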
20. Information Gain
• The information gain is based on the decrease in entropy after a
dataset is split on an attribute.
• The formula for calculating information gain is:
Gain(S, A) = Entropy(S) − ∑ (|Sv| / |S|) · Entropy(Sv)
where the sum runs over every value v of attribute A,
Sv = subset of S for which attribute A has value v,
• |Sv| = number of elements in Sv
• |S| = number of elements in S
21. Procedure
• First the entropy of the total dataset is calculated.
• The dataset is then split on the different attributes.
• The entropy for each branch is calculated.
• The branch entropies are then added, weighted proportionally by branch size, to get the total entropy of the split.
• The resulting entropy is subtracted from the entropy before the
split.
• The result is the Information Gain, or decrease in entropy.
• The attribute that yields the largest IG is chosen for the decision
node.
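The procedure above can be sketched in Python against the ten-row training set from these slides; the dictionary keys are my own shorthand for the attribute names:

```python
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr, target="transport"):
    """Entropy(S) minus the size-weighted entropy of the subsets of S
    obtained by splitting on attr."""
    total = [r[target] for r in rows]
    split = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        split += len(subset) / len(rows) * entropy(subset)
    return entropy(total) - split

# the ten training examples from the slides
data = [
    {"gender": "M", "cars": 0, "cost": "Cheap",     "income": "Low",    "transport": "Bus"},
    {"gender": "M", "cars": 1, "cost": "Cheap",     "income": "Medium", "transport": "Bus"},
    {"gender": "F", "cars": 1, "cost": "Cheap",     "income": "Medium", "transport": "Train"},
    {"gender": "F", "cars": 0, "cost": "Cheap",     "income": "Low",    "transport": "Bus"},
    {"gender": "M", "cars": 1, "cost": "Cheap",     "income": "Medium", "transport": "Bus"},
    {"gender": "M", "cars": 0, "cost": "Standard",  "income": "Medium", "transport": "Train"},
    {"gender": "F", "cars": 1, "cost": "Standard",  "income": "Medium", "transport": "Train"},
    {"gender": "F", "cars": 1, "cost": "Expensive", "income": "High",   "transport": "Car"},
    {"gender": "M", "cars": 2, "cost": "Expensive", "income": "Medium", "transport": "Car"},
    {"gender": "F", "cars": 2, "cost": "Expensive", "income": "High",   "transport": "Car"},
]

for attr in ("gender", "cars", "cost", "income"):
    print(attr, round(info_gain(data, attr), 3))
# Travel Cost has the largest gain (≈ 1.21), so it is chosen as the root.
```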
23. Training Set (shown again for the calculation)
Gender | Car Ownership | Travel Cost | Income Level | Transportation
Male   | 0 | Cheap     | Low    | Bus
Male   | 1 | Cheap     | Medium | Bus
Female | 1 | Cheap     | Medium | Train
Female | 0 | Cheap     | Low    | Bus
Male   | 1 | Cheap     | Medium | Bus
Male   | 0 | Standard  | Medium | Train
Female | 1 | Standard  | Medium | Train
Female | 1 | Expensive | High   | Car
Male   | 2 | Expensive | Medium | Car
Female | 2 | Expensive | High   | Car
24. Calculate the entropy of the total dataset
• First compute the Entropy of given training set.
Probability
Bus : 4/10 = 0.4
Train: 3/10 = 0.3
Car:3/10 = 0.3
E(S) = − ∑ P(I) log2 P(I)
E(S) = −(0.4) log2(0.4) − (0.3) log2(0.3) − (0.3) log2(0.3) = 1.571
28. Split the dataset on the ‘Car Ownership’ attribute
Ownership = 0 (3 rows):
Bus: 2/3 ≈ 0.67
Train: 1/3 ≈ 0.33
Car: 0/3 = 0
Entropy = 0.918
Ownership = 1 (5 rows):
Bus: 2/5 = 0.4
Train: 2/5 = 0.4
Car: 1/5 = 0.2
Entropy = 1.522
Ownership = 2 (2 rows):
Bus: 0/2 = 0
Train: 0/2 = 0
Car: 2/2 = 1
Entropy = 0
Gain(S, A) = E(S) − I(S, A)
I(S, A) = 0.918 · (3/10) + 1.522 · (5/10) + 0 · (2/10) = 1.036
Gain(S, A) = 1.571 − 1.036 = 0.534
29. • If we choose Travel Cost as splitting attribute,
-Entropy for Cheap = 0.722
Standard = 0
Expensive = 0
IG = 1.21
• If we choose Income Level as splitting attribute,
-Entropy for Low = 0
Medium = 1.459
High = 0
IG = 0.695
32. Iteration on the Subset of the Training Set where Travel Cost = Cheap
Probability
Bus: 4/5 = 0.8
Train: 1/5 = 0.2
Car: 0/5 = 0
E(S) = − ∑ P(I) log2 P(I)
E(S) = −(0.8) log2(0.8) − (0.2) log2(0.2) = 0.722
33. • If we choose Gender as splitting attribute,
-Entropy for Male = 0
Female = 1
IG = 0.322
• If we choose Car Ownership as splitting attribute,
-Entropy for 0 = 0
1 = 0.918
IG = 0.171
• If we choose Income Level as splitting attribute,
-Entropy for Low = 0
Medium = 0.918
IG = 0.171
35. Diagram: Decision Tree
[Figure: the learned tree. Root node: Travel Cost. Expensive → Car;
Standard → Train; Cheap → test Gender: Male → Bus; Female → test
Car Ownership: 0 → Bus, 1 → Train.]
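Read as nested rules, the learned tree amounts to the following sketch (the function and argument names are mine, not from the slides):

```python
def classify(gender, car_ownership, travel_cost):
    """The tree from the diagram, written out as nested rules."""
    if travel_cost == "Expensive":
        return "Car"
    if travel_cost == "Standard":
        return "Train"
    # travel_cost == "Cheap": test gender, then car ownership
    if gender == "Male":
        return "Bus"
    return "Train" if car_ownership == 1 else "Bus"

print(classify("Male", 1, "Standard"))  # Train
```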
36. Solution to Our Problem:
Name | Gender | Car Ownership | Travel Cost | Income Level | Transportation
Abhi | Male   | 1 | Standard | High   | Train
Pavi | Male   | 0 | Cheap    | Medium | Bus
Ammu | Female | 1 | Cheap    | High   | Train
(Ammu follows the Cheap → Female → Car Ownership = 1 branch, which the training data labels Train.)
37. Drawbacks of ID3
• The data may be over-fitted or over-classified if a small sample is
tested.
• Only one attribute at a time is tested for making a decision.
• Classifying continuous data may be computationally
expensive.
• Does not handle numeric attributes and missing values.
38. C4.5 Algorithm
• This algorithm was also invented by J. Ross Quinlan, for generating
decision trees.
• It can be treated as an extension of ID3.
• C4.5 is mainly used for classification of data.
• It is called a statistical classifier because of its potential for
handling noisy data.
39. Construction
• Base cases are determined.
• For each attribute a, the normalized information gain obtained
by splitting on a is computed.
• Let a-best be the attribute with the highest normalized
information gain.
• Create a decision node that splits on a-best.
• Recurse on the sub-lists obtained by splitting on a-best, and add
those nodes as children of the node.
40. • The new Features include,
(i) accepts both continuous and discrete features.
(ii) handles incomplete data points.
(iii) solves over-fitting problem by bottom-up technique
usually known as "pruning”.
(iv) different weights can be applied to the features that comprise
the training data.
41. Application based on features
• Continuous attribute categorisation:
discretisation – it partitions the continuous attribute values
into a discrete set of intervals.
• Handling missing values:
uses probability theory – probabilities are calculated from
the observed frequencies, and the most probable value is taken
after computing the frequencies.
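One common way to discretise a continuous attribute, in the spirit of C4.5, is to try the midpoints between consecutive sorted values as candidate thresholds and keep the one with the lowest weighted entropy. A minimal sketch, with illustrative data of my own:

```python
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def best_threshold(values, labels):
    """Candidate thresholds are midpoints between consecutive distinct
    sorted values; return the one with the lowest weighted entropy."""
    pairs = sorted(zip(values, labels))
    best = (float("inf"), None)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        t = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for v, lab in pairs if v <= t]
        right = [lab for v, lab in pairs if v > t]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        best = min(best, (weighted, t))
    return best[1]

# toy continuous attribute: age against a yes/no class
ages = [18, 22, 25, 31, 40, 46]
labels = ["no", "no", "no", "yes", "yes", "yes"]
print(best_threshold(ages, labels))  # 28.0
```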
42. Random Forest
• Random forest is an ensemble classifier which uses many
decision trees.
• It builds multiple decision trees and merges them together to
get a more accurate and stable prediction.
• It can be used for both classification and regression problems.
43. Algorithm
1) Create a bootstrapped dataset.
2) Create a decision tree using the bootstrapped dataset, but
only use a random subset of the variables (or columns) at each
step.
3) Go back to step 1 and repeat.
Using a bootstrapped sample and considering only a subset of
variables at each step results in a wide variety of trees.
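Steps 1 and 2 can be sketched as two helpers: one draws a bootstrapped sample (the rows never drawn form the out-of-bag set), the other draws the random feature subset tried at each split. The names and the feature list are my own illustration:

```python
import random

def bootstrap(dataset, rng):
    """Sample with replacement to the original size; rows that are never
    drawn form the out-of-bag set used to estimate error."""
    n = len(dataset)
    picks = [rng.randrange(n) for _ in range(n)]
    in_bag = [dataset[i] for i in picks]
    out_of_bag = [dataset[i] for i in range(n) if i not in set(picks)]
    return in_bag, out_of_bag

def feature_subset(features, k, rng):
    """At each split, only a random subset of k candidate features is tried."""
    return rng.sample(features, k)

rng = random.Random(0)
in_bag, oob = bootstrap(list(range(10)), rng)
# in-bag and out-of-bag rows together always cover the original dataset
print(len(in_bag), sorted(set(in_bag) | set(oob)) == list(range(10)))  # 10 True
print(feature_subset(["gender", "cars", "cost", "income"], 2, rng))
```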
44. Explanation
• First, the process starts with randomisation of the data sets
(done by bagging).
• Bagging helps to reduce the variance and enables effective
classification.
• A random forest contains many classification trees.
• Bagging – bootstrapping the data plus using the aggregate to
make a decision.
• Out-of-Bag dataset – the dataset formed by the tuples from the
training set that were not chosen during bootstrapping.
45. Gini Index
• Random Forest uses the Gini index, taken from the CART learning
system, to construct decision trees. The Gini index of node
impurity is the measure most commonly chosen for
classification-type problems.
• If a dataset T contains examples from n classes, the Gini index
Gini(T) is defined as:
Gini(T) = 1 − ∑_{j=1}^{n} (p_j)^2
where p_j is the relative frequency of class j in T.
46. • If a dataset T of size N is split into two subsets T1 and T2 with
sizes N1 and N2 respectively, the Gini index of the split data is
defined as:
Gini_split(T) = (N1/N) · Gini(T1) + (N2/N) · Gini(T2)
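The two formulas translate directly into code (a minimal illustration; function names are mine):

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class frequencies."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_split(left, right):
    """Size-weighted Gini index of a two-way split."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# class counts of the full travel dataset: 4 Bus, 3 Train, 3 Car
labels = ["Bus"] * 4 + ["Train"] * 3 + ["Car"] * 3
print(round(gini(labels), 2))  # 0.66
```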
48. Advantages
• It produces a highly accurate classifier, and learning is fast.
• It runs efficiently on large databases.
• It can handle thousands of input variables without variable
deletion.
• It computes proximities between pairs of cases that can be
used in clustering, locating outliers or (by scaling) give
interesting views of the data.
• It offers an experimental method for detecting variable
interactions.
Editor's Notes
A classification technique (or classifier) is a systematic approach to building
classification models from an input data set. Examples include decision tree
classifiers, rule-based classifiers, neural networks, support vector machines,
and naïve Bayes classifiers. Each technique employs a
learning algorithm
to identify a model that best fits the relationship between the attribute set and
class label of the input data. The model generated by a learning algorithm
should both fit the input data well and correctly predict the class labels of
records it has never seen before. Therefore, a key objective of the learning
algorithm is to build models with good generalization capability; i.e., models
that accurately predict the class labels of previously unknown records.
- Ensemble methods are learning algorithms that construct a set of classifiers and then classify new data points by taking a (weighted) vote of their predictions.
- To create a bootstrapped dataset that is the same size as the original, we just randomly select samples from the original dataset.