This document provides an overview of machine learning, including supervised learning, unsupervised learning, and reinforcement learning. It then discusses decision tree learning and decision trees in more detail. Decision tree algorithms like ID3 and C4.5 are explained as popular inductive inference algorithms that use an information gain measure to select attributes at each step of growing the decision tree. The document also covers converting decision trees to rules and splitting information. Linear models and artificial neural networks are briefly introduced, with the backpropagation algorithm explained as the gradient descent learning rule used in multilayer feedforward neural networks.
Lec 18-19.pptx
1. Machine Learning
It is a field of study that gives computers the ability to learn without being explicitly
programmed.
Forms of Machine learning
• Supervised Learning
⮚ Prior knowledge about class label
⮚ Common examples are Random Forest, Decision Tree, Naïve Bayes etc.
• Unsupervised Learning
⮚ No prior knowledge about class label
⮚ Common examples are K-means, Apriori etc.
• Reinforcement Learning
⮚ Based on reward or penalty
⮚ Agent is able to perceive and interpret its environment, take actions and learn through
trial and error.
⮚ Common examples are Q-learning etc.
2. Decision Tree Learning
• Decision tree learning is a method for approximating discrete-valued target functions.
• The learned function is represented by a decision tree.
⮚ A learned decision tree can also be represented as a set of if-then rules.
• Decision tree learning is one of the most widely used and practical methods for
inductive inference.
• It is robust to noisy data and capable of learning disjunctive expressions.
• The decision tree learning method searches a completely expressive hypothesis space.
⮚ Avoids the difficulties of restricted hypothesis spaces.
⮚ Its inductive bias is a preference for small trees over large trees.
• Decision tree algorithms such as ID3 and C4.5 are very popular inductive inference
algorithms, and they have been successfully applied to many learning tasks.
3. Decision Tree
• A decision tree represents a disjunction of conjunctions of constraints on the
attribute values of instances.
• Each path from the tree root to a leaf corresponds to a conjunction of attribute
tests.
• The tree itself is a disjunction of these conjunctions.
(Outlook = Sunny ∧ Humidity = Normal)
∨ (Outlook = Overcast)
∨ (Outlook = Rain ∧ Wind = Weak)
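The disjunction of conjunctions above can be written directly as a Boolean function. The following is a minimal Python sketch; the function name `play_tennis` and the string encoding of attribute values are assumptions made for illustration and do not come from the slides.

```python
def play_tennis(outlook: str, humidity: str, wind: str) -> bool:
    """Return True when the instance satisfies the disjunction of conjunctions.

    Each parenthesised term is one root-to-leaf path ending in a positive leaf.
    The name and string encoding are hypothetical.
    """
    return (
        (outlook == "Sunny" and humidity == "Normal")
        or outlook == "Overcast"
        or (outlook == "Rain" and wind == "Weak")
    )


print(play_tennis("Sunny", "Normal", "Strong"))
print(play_tennis("Rain", "High", "Strong"))
```

Any instance that matches none of the three conjunctions is classified as negative.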
4. Decision Tree
• Decision trees classify instances by sorting them down the tree from the root to some
leaf node, which provides the classification of the instance.
• Each node in the tree specifies a test of some attribute of the instance.
• Each branch descending from a node corresponds to one of the possible values for the
attribute.
• Each leaf node assigns a classification.
• The instance (Outlook = Sunny, Temperature = Hot, Humidity = High, Wind = Strong)
is classified as a negative instance.
5. Which Attribute is “best”?
• We would like to select the attribute that is most useful for classifying examples.
• Information Gain measures how well a given attribute separates the training examples
according to their target classification.
• ID3 uses this information gain measure to select among the candidate attributes at
each step while growing the tree.
• In order to define information gain precisely, we use a measure commonly used in
information theory, called entropy.
• Entropy characterizes the (im)purity of an arbitrary collection of examples.
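The slides giving the formulas are not included here, so the sketch below assumes the standard information-theoretic definitions: entropy is the negative sum of p·log2(p) over the class proportions, and information gain is the entropy of the whole collection minus the size-weighted entropy of the subsets produced by splitting on an attribute. The function names and the example counts are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Entropy, in bits, of a collection of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Expected reduction in entropy from partitioning labels by attribute values."""
    total = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [lab for val, lab in zip(values, labels) if val == v]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


# Hypothetical data: 8 Weak-wind examples (6 Yes, 2 No), 6 Strong-wind (3 Yes, 3 No).
wind = ["Weak"] * 8 + ["Strong"] * 6
play = ["Yes"] * 6 + ["No"] * 2 + ["Yes"] * 3 + ["No"] * 3

print(round(entropy(play), 3))
print(round(information_gain(wind, play), 3))
```

A pure collection (all one class) has entropy 0, and an evenly split two-class collection has entropy 1 bit. An ID3-style learner would compute the gain for every candidate attribute and split on the largest.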
18. Linear Models
A strong high-bias assumption is linear separability:
– in 2 dimensions, can separate classes by a line
– in higher dimensions, need hyperplanes
A linear model is a model that assumes the data is linearly separable.
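19. A linear model in n-dimensional space (i.e. n features) is defined by n+1
weights:
In two dimensions, a line: 0 = w1 f1 + w2 f2 + b
(where b = -a, for the line written as w1 f1 + w2 f2 = a)
In three dimensions, a plane: 0 = w1 f1 + w2 f2 + w3 f3 + b
In n dimensions, a hyperplane: 0 = b + Σj wj fj (j = 1..n)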
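A linear model of this kind can be used as a classifier by checking which side of the hyperplane a point falls on. The Python sketch below assumes the usual form, a bias b plus the weighted sum of the features, with the sign of that score giving the class; the names `linear_score` and `predict` and the +1/-1 labels are assumptions for illustration.

```python
def linear_score(weights, bias, features):
    """Bias plus the weighted sum of the features (n weights plus b: n+1 parameters)."""
    return bias + sum(w * f for w, f in zip(weights, features))


def predict(weights, bias, features):
    """Return +1 on the positive side of the hyperplane, otherwise -1."""
    return 1 if linear_score(weights, bias, features) > 0 else -1


# Hypothetical 2-D model: the separating line is f1 - f2 = 0.
weights, bias = [1.0, -1.0], 0.0
print(predict(weights, bias, [2.0, 1.0]))
print(predict(weights, bias, [1.0, 2.0]))
```

Points with a score of exactly 0 lie on the separating line itself; this sketch arbitrarily assigns them to the negative class.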
20. Artificial Neural Network (ANN)
• Artificial Neural Networks (ANNs) are programs designed to solve problems by
mimicking the structure and function of our nervous system.
• Neural networks are based on simulated neurons, which are joined together in a
variety of ways to form networks.
• A neural network resembles the human brain in the following two ways:
⮚ A neural network acquires knowledge through learning
⮚ A neural network’s knowledge is stored within the interconnection strengths known as
synaptic weights.
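A single simulated neuron can be sketched as a weighted sum of its inputs passed through an activation function. The sigmoid activation and the names below are assumptions; the slides shown here do not specify which activation is used.

```python
from math import exp


def sigmoid(a):
    """Squash an activation value into the range (0, 1)."""
    return 1.0 / (1.0 + exp(-a))


def neuron_output(weights, inputs):
    """Weighted sum of the inputs (the synaptic weights) followed by the sigmoid."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(activation)


print(neuron_output([0.5, -0.5], [1.0, 1.0]))
```

All of the neuron's "knowledge" lives in `weights`; learning means changing those numbers.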
26. Backpropagation Algorithm
• The backpropagation algorithm (Rumelhart and McClelland, 1986) is used in
layered feed-forward Artificial Neural Networks.
• Backpropagation is a supervised learning method for multi-layer feed-forward
networks, based on the gradient descent learning rule.
• We provide the algorithm with examples of the inputs and outputs we want the
network to compute, and then the error (difference between actual and expected
results) is calculated.
• The idea of the backpropagation algorithm is to reduce this error, until the
Artificial Neural Network learns the training data.
28. • The backpropagation algorithm now calculates how the error depends on the output,
inputs and weights.
• The adjustment of each weight (Δwji) will be the negative of a constant eta (η)
multiplied by the dependence of the error of the network on the previous weight wji,
that is, Δwji = -η ∂E/∂wji, where E is the error.
• First, we need to calculate how much the error depends on the output.
• Next, how much the output depends on the activation, which in turn depends on the
weights.
• And so, the adjustment to each weight will be -η times the product of these
dependencies (the chain rule).
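The weight adjustment described above can be sketched for the simplest case: a single sigmoid neuron trained on one example. This is a simplification, not full multi-layer backpropagation, and the squared-error measure, the sigmoid activation, the value of eta, and all names are assumptions made for illustration.

```python
from math import exp


def sigmoid(a):
    return 1.0 / (1.0 + exp(-a))


def train_step(weights, inputs, target, eta=0.5):
    """One gradient-descent update; returns the new weights and the error before the update."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    output = sigmoid(activation)
    error = (output - target) ** 2

    d_error_d_output = 2.0 * (output - target)        # how the error depends on the output
    d_output_d_activation = output * (1.0 - output)   # how the output depends on the activation
    # The activation depends on weight i through input x_i, so the chain rule gives:
    new_weights = [
        w - eta * d_error_d_output * d_output_d_activation * x
        for w, x in zip(weights, inputs)
    ]
    return new_weights, error


weights = [0.1, -0.2]          # hypothetical starting weights
inputs, target = [1.0, 0.5], 1.0
errors = []
for _ in range(50):
    weights, err = train_step(weights, inputs, target)
    errors.append(err)

print(errors[0] > errors[-1])
```

In a multi-layer network the same chain rule is applied layer by layer, passing the error dependence backwards from the output layer to the earlier weights.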