Lec 18-19.pptx

Machine Learning
It is a field of study that gives computers the ability to learn without being explicitly
programmed.
Forms of Machine learning
• Supervised Learning
⮚ Prior knowledge about class label
⮚ Common examples are Random Forest, Decision Tree, Naïve Bayes etc.
• Unsupervised Learning
⮚ No prior knowledge about class label
⮚ Common examples are K-means, Apriori etc.
• Reinforcement Learning
⮚ Based on reward or penalty
⮚ Agent is able to perceive and interpret its environment, take actions and learn through
trial and error.
⮚ Common examples are Q-learning etc.

Decision Tree Learning
• Decision tree learning is a method for approximating discrete-valued target functions.
• The learned function is represented by a decision tree.
⮚ A learned decision tree can also be represented as a set of if-then rules.
• Decision tree learning is one of the most widely used and practical methods for
inductive inference.
• It is robust to noisy data and capable of learning disjunctive expressions.
• The decision tree learning method searches a completely expressive hypothesis space.
⮚ Avoids the difficulties of restricted hypothesis spaces.
⮚ Its inductive bias is a preference for small trees over large trees.
• Decision tree algorithms such as ID3 and C4.5 are very popular inductive inference
algorithms, and they have been successfully applied to many learning tasks.

Decision Tree
• A decision tree represents a disjunction of conjunctions of constraints on the
attribute values of instances.
• Each path from the tree root to a leaf corresponds to a conjunction of attribute
tests.
• The tree itself is a disjunction of these conjunctions.
(Outlook = Sunny ∧ Humidity = Normal)
∨ (Outlook = Overcast)
∨ (Outlook = Rain ∧ Wind = Weak)
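The disjunction of conjunctions above can be written directly as a Boolean function. This is only a sketch: the function name `play_tennis` and the string encoding of attribute values are assumptions made for illustration.

```python
def play_tennis(outlook: str, humidity: str, wind: str) -> bool:
    """Return True when the instance satisfies the tree's positive paths.

    Each parenthesised term is one root-to-leaf path (a conjunction);
    the paths are joined by `or` (the disjunction).
    """
    return (
        (outlook == "Sunny" and humidity == "Normal")
        or (outlook == "Overcast")
        or (outlook == "Rain" and wind == "Weak")
    )


if __name__ == "__main__":
    print(play_tennis("Overcast", "High", "Strong"))
    print(play_tennis("Sunny", "High", "Strong"))
```

Any instance that matches none of the three conjunctions is classified as negative.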

Decision Tree
• Decision trees classify instances by sorting them down the tree from the root to some
leaf node, which provides the classification of the instance.
• Each node in the tree specifies a test of some attribute of the instance.
• Each branch descending from a node corresponds to one of the possible values for the
attribute.
• Each leaf node assigns a classification.
• The instance (Outlook = Sunny, Temperature = Hot, Humidity = High, Wind = Strong)
is classified as a negative instance.
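Sorting an instance down the tree can be sketched as a short loop. The tuple/dict encoding of the tree and the leaf labels "Yes" and "No" are assumptions for illustration; the branch structure follows the Outlook/Humidity/Wind tree described in these slides.

```python
# Internal node: (attribute_name, {attribute_value: subtree}); leaf: a label string.
TREE = (
    "Outlook",
    {
        "Sunny": ("Humidity", {"High": "No", "Normal": "Yes"}),
        "Overcast": "Yes",
        "Rain": ("Wind", {"Strong": "No", "Weak": "Yes"}),
    },
)


def classify(node, instance):
    """Follow the branch matching the instance's value at each node until a leaf."""
    while not isinstance(node, str):
        attribute, branches = node
        node = branches[instance[attribute]]
    return node


if __name__ == "__main__":
    instance = {
        "Outlook": "Sunny",
        "Temperature": "Hot",
        "Humidity": "High",
        "Wind": "Strong",
    }
    print(classify(TREE, instance))
```

Temperature is never tested by this tree, so its value does not affect the result.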


Which Attribute is “best”?
• We would like to select the attribute that is most useful for classifying examples.
• Information Gain measures how well a given attribute separates the training examples
according to their target classification.
• ID3 uses this information gain measure to select among the candidate attributes at
each step while growing the tree.
• In order to define information gain precisely, we use a measure commonly used in
information theory, called entropy.
• Entropy characterizes the (im)purity of an arbitrary collection of examples.
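Entropy and information gain can be sketched from class counts alone. The code uses the standard definitions, Entropy(S) = -Σ p_i log2 p_i and Gain(S, A) = Entropy(S) - Σ_v (|S_v|/|S|) Entropy(S_v). The function names and the list-of-counts representation are assumptions for illustration.

```python
import math


def entropy(counts):
    """Entropy in bits of a collection, given the number of examples per class."""
    total = sum(counts)
    # Classes with zero examples contribute nothing (0 * log 0 is taken as 0).
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(parent_counts, child_counts):
    """Expected reduction in entropy from splitting the parent into the children.

    child_counts holds one class-count list per attribute value.
    """
    total = sum(parent_counts)
    remainder = sum(sum(child) / total * entropy(child) for child in child_counts)
    return entropy(parent_counts) - remainder


if __name__ == "__main__":
    print(entropy([7, 7]))                             # evenly mixed collection
    print(entropy([14, 0]))                            # pure collection
    print(information_gain([4, 4], [[4, 0], [0, 4]]))  # split that separates the classes
```

A pure collection has entropy 0, an evenly mixed two-class collection has entropy 1 bit, and a split that separates the classes perfectly recovers all of the parent's entropy as gain.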

Which attribute is best classifier?

ID3 Training examples – [9+, 5-]
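As a worked example, the entropy of a collection with 9 positive and 5 negative examples follows from the standard two-class entropy formula:

```latex
\mathrm{Entropy}([9+,\,5-])
  = -\tfrac{9}{14}\log_2\tfrac{9}{14} \;-\; \tfrac{5}{14}\log_2\tfrac{5}{14}
  \approx 0.940
```

The value is close to 1 bit because the collection is fairly evenly mixed between the two classes.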

ID3 Selecting Next Attribute
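The selection step can be sketched as follows. The training table is not reproduced in this text, so the rows below are an assumption: they are the 14 PlayTennis examples from Mitchell's "Machine Learning" textbook, chosen because they match the [9+, 5-] split. The helper names are also illustrative.

```python
import math
from collections import Counter

ATTRIBUTES = ("Outlook", "Temperature", "Humidity", "Wind")

# Assumed data: (Outlook, Temperature, Humidity, Wind, PlayTennis).
ROWS = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]


def entropy(labels):
    """Entropy in bits of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, column):
    """Entropy of the labels minus the weighted entropy after splitting on a column."""
    groups = {}
    for row in rows:
        groups.setdefault(row[column], []).append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - remainder


# Gain of every candidate attribute; ID3 tests the one with the highest gain.
gains = {name: information_gain(ROWS, i) for i, name in enumerate(ATTRIBUTES)}
best = max(gains, key=gains.get)

if __name__ == "__main__":
    for name, gain in gains.items():
        print(f"{name}: {gain:.3f}")
    print("Attribute chosen at the root:", best)
```

On this assumed data, Outlook has the highest gain, so it becomes the root test; ID3 then repeats the same selection on the examples that reach each branch.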

Converting Decision Tree into Rules
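Since each root-to-leaf path is a conjunction of attribute tests, a tree can be turned into if-then rules by enumerating its paths. The sketch below assumes a tuple/dict tree encoding, "Yes"/"No" leaf labels and a target name `PlayTennis`, none of which come from the slides.

```python
# Internal node: (attribute_name, {attribute_value: subtree}); leaf: a label string.
TREE = (
    "Outlook",
    {
        "Sunny": ("Humidity", {"High": "No", "Normal": "Yes"}),
        "Overcast": "Yes",
        "Rain": ("Wind", {"Strong": "No", "Weak": "Yes"}),
    },
)


def tree_to_rules(node, conditions=()):
    """Return one IF ... THEN ... rule per root-to-leaf path."""
    if isinstance(node, str):
        antecedent = " AND ".join(f"{a} = {v}" for a, v in conditions) or "TRUE"
        return [f"IF {antecedent} THEN PlayTennis = {node}"]
    attribute, branches = node
    rules = []
    for value, child in branches.items():
        # Extend the path with the test taken on this branch.
        rules.extend(tree_to_rules(child, conditions + ((attribute, value),)))
    return rules


RULES = tree_to_rules(TREE)

if __name__ == "__main__":
    for rule in RULES:
        print(rule)
```

The tree has five leaves, so the sketch produces five rules, one per leaf.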

Linear Models
A strong high-bias assumption is linear separability:
– in 2 dimensions, can separate classes by a line
– in higher dimensions, need hyperplanes
A linear model is a model that assumes the data is linearly separable

A linear model in n-dimensional space (i.e. n features) is defined by n+1
weights:
In two dimensions, a line: 0 = w1 f1 + w2 f2 + b (where b = -a)
In three dimensions, a plane: 0 = w1 f1 + w2 f2 + w3 f3 + b
In m-dimensions, a hyperplane: 0 = b + Σ wj fj (j = 1, ..., m)
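A linear model of this kind can be sketched as a weighted sum of the features plus a bias, with the sign of the sum deciding the class. The notation (weights w, bias b, features f) and the +1/-1 class encoding are assumptions for illustration.

```python
def linear_score(weights, b, features):
    """Value of b + sum_j w_j * f_j; it is zero exactly on the separating hyperplane."""
    return sum(w * f for w, f in zip(weights, features)) + b


def predict(weights, b, features):
    """Assign +1 on the positive side of the hyperplane and -1 otherwise."""
    return 1 if linear_score(weights, b, features) > 0 else -1


if __name__ == "__main__":
    # Two features: the line f1 - f2 = 0 separates the plane (n + 1 = 3 parameters).
    weights, b = (1.0, -1.0), 0.0
    print(predict(weights, b, (2.0, 1.0)))
    print(predict(weights, b, (1.0, 2.0)))
```

Adding a feature adds one weight, so n features always need n weights plus the bias b.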

Artificial Neural Network (ANN)
• Artificial Neural Networks (ANNs) are programs designed to solve problems by
trying to mimic the structure and the function of our nervous system.
• Neural networks are based on simulated neurons, which are joined together in a
variety of ways to form networks.
• A neural network resembles the human brain in the following two ways:
⮚ A neural network acquires knowledge through learning
⮚ A neural network’s knowledge is stored within the interconnection strengths known as
synaptic weights.
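A single simulated neuron can be sketched as a weighted sum of its inputs passed through an activation function. The sigmoid activation and the names `neuron`, `weights` and `bias` are assumptions; the slides do not specify them.

```python
import math


def neuron(inputs, weights, bias=0.0):
    """Weighted sum of the inputs (the activation) squashed by a sigmoid."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-activation))


if __name__ == "__main__":
    # The weights play the role of synaptic strengths: learning changes them,
    # and the neuron's response to the same input changes with them.
    print(neuron([1.0, 0.5], [0.0, 0.0]))
    print(neuron([1.0, 0.5], [2.0, 2.0]))
```

Everything the neuron "knows" lives in its weights and bias, which is why training a network amounts to adjusting those values.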

Backpropagation Algorithm
• The backpropagation algorithm (Rumelhart and McClelland, 1986) is used in
layered feed-forward Artificial Neural Networks.
• Backpropagation is a supervised learning method for multi-layer feed-forward
networks, based on the gradient descent learning rule.
• We provide the algorithm with examples of the inputs and outputs we want the
network to compute, and then the error (difference between actual and expected
results) is calculated.
• The idea of the backpropagation algorithm is to reduce this error, until the
Artificial Neural Network learns the training data.

• The backpropagation algorithm now calculates how the error depends on the outputs,
inputs and weights.
• The adjustment of each weight (Δwji) will be the negative of a constant eta (η)
multiplied by the dependence of the error of the network on the previous weight wji.
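• First, we need to calculate how much the error depends on the output: ∂E/∂Oj
(writing E for the error, Oj for the output and Aj for the activation of neuron j).
• Next, how much the output depends on the activation, which in turn depends on the
weights: ∂Oj/∂wji = (∂Oj/∂Aj)(∂Aj/∂wji)
• And so, the adjustment to each weight will be Δwji = −η (∂E/∂Oj)(∂Oj/∂Aj)(∂Aj/∂wji)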
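The weight adjustment can be sketched for one output neuron. The code assumes a sigmoid activation and a squared error E = (O - d)^2, which the slides do not specify. Under those assumptions the chain rule gives Δw_i = -η · 2(O - d) · O(1 - O) · x_i. All names are illustrative.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def train_step(weights, inputs, target, eta=0.5):
    """One gradient-descent update of the weights; returns (new_weights, error)."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    output = sigmoid(activation)
    d_error_d_output = 2.0 * (output - target)        # how the error depends on the output
    d_output_d_activation = output * (1.0 - output)   # derivative of the sigmoid
    new_weights = [
        # The activation depends on weight i through its input x_i.
        w - eta * d_error_d_output * d_output_d_activation * x
        for w, x in zip(weights, inputs)
    ]
    return new_weights, (output - target) ** 2


weights = [0.0, 0.0]
inputs, target = [1.0, 0.5], 1.0
errors = []
for _ in range(50):
    weights, error = train_step(weights, inputs, target)
    errors.append(error)

if __name__ == "__main__":
    print(f"first error: {errors[0]:.4f}, last error: {errors[-1]:.4f}")
```

Each step moves the weights against the gradient of the error, so the error on this single training example shrinks from one step to the next. In a multi-layer network the same chain rule is applied layer by layer, from the output back towards the input.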
