Noida Institute of Engineering and Technology, Greater Noida

ML Classifiers

Subject Name: Machine Learning (AEC0516)
Unit: 5
Course Details: B. Tech (V SEM)

Faculty Details:
Dr. Laxman Singh
Associate Professor
ECE (AI) Department
Evaluation Scheme (EC-Vth Semester)

Elective Subjects

Syllabus

Applications
Course Objectives

Course Name: Machine Learning (KEC-503)
Year: Third Year / Fifth Semester

KEC 503.1: The basics of machine learning, statistics, and probability theory.
KEC 503.2: Neurons, neural networks, and the multilayer perceptron.
KEC 503.3: Identification of the dimensionality of data and its reduction using various mathematical concepts, as well as probabilistic learning.
KEC 503.4: Various search and optimization techniques applied to raw data.
KEC 503.5: Various learning techniques and approaches.
Course Outcomes (COs)
COUR
SE
OUTC
OME
NO.
COURSE OUTCOMES
After completion of this course, students will be able to
CO1
Describe the basic concepts of machine learning, statistics,
and probability theory.
CO2
Define and describe the Neurons, neural networks, and multilayer
perceptron.
CO3
Identify the dimensionality of data and reduces it using various
mathematical concepts as well as describe the probabilistic learning.
Machine Learning Unit 1
11
Course Outcomes (COs)
COURSE
OUTCOME NO.
COURSE OUTCOMES
After completion of this course, students will
be able to
CO4
Describe and apply various search and optimization
techniques to the raw data.
CO5 Illustrate and apply various learning techniques.
Machine Learning Unit 1
12
Program Outcomes

• Program Outcomes are narrow statements that describe what the students are expected to know and be able to do upon graduation.
• These relate to the skills, knowledge, and behaviour that students acquire through the programme.

1. Engineering knowledge
2. Problem analysis
3. Design/development of solutions
4. Conduct investigations of complex problems
5. Modern tool usage
6. The engineer and society
7. Environment and sustainability
8. Ethics
9. Individual and team work
10. Communication
11. Project management and finance
12. Life-long learning
COs-POs Mapping

Mapping of Course Outcomes and Program Outcomes:

Course Outcome | PO1 | PO2 | PO3 | PO4 | PO5 | PO6 | PO7 | PO8 | PO9 | PO10 | PO11 | PO12
KEC503.1       |  3  |  2  |  2  |  -  |  -  |  -  |  -  |  -  |  -  |  -   |  -   |  1
KEC503.2       |  3  |  3  |  3  |  -  |  -  |  -  |  -  |  -  |  -  |  -   |  -   |  1
KEC503.3       |  3  |  3  |  3  |  -  |  -  |  -  |  -  |  -  |  -  |  -   |  -   |  1
KEC503.4       |  3  |  2  |  1  |  -  |  -  |  -  |  -  |  -  |  -  |  -   |  -   |  1
KEC503.5       |  3  |  2  |  2  |  -  |  -  |  -  |  -  |  -  |  -  |  -   |  -   |  1
Average        |  3  | 2.4 | 2.2 |  -  |  -  |  -  |  -  |  -  |  -  |  -   |  -   |  1
Program Specific Outcomes

On successful completion of the graduate degree, Electronics and Communication graduates will be able to:
1. Apply the knowledge of mathematics, science, and electronics & communication engineering to work effectively in industry in the same or a related area.
2. Use their skills with modern electronics & communication engineering tools, software, and equipment to design solutions for complex problems in the related field that meet the specified needs of society.
3. Function effectively as an individual and as a member or leader of a team, qualifying through examinations like GATE, IES, PSUs, TOEFL, GMAT, GRE, etc.
COs-PSOs Mapping

Mapping of Course Outcomes and Program Specific Outcomes:

Course Outcome | PSO1 | PSO2 | PSO3
AEC0516.1      |  3   |  -   |  -
AEC0516.2      |  3   |  2   |  -
AEC0516.3      |  3   |  2   |  -
AEC0516.4      |  3   |  2   |  2
AEC0516.5      |  3   |  2   |  -
Average        |  3   |  2   |  2
Program Educational Objectives

The Program Educational Objectives (PEOs) of the B. Tech (Electronics & Communication Engineering) program are as follows:
1. To have excellent scientific and engineering breadth so as to comprehend, analyze, design, and solve real-life problems using state-of-the-art technology.
2. To lead a successful career in industry, to pursue higher studies, or to undertake entrepreneurial endeavors.
3. To effectively bridge the gap between industry and academia through effective communication skills, a professional attitude, and a desire to learn.
Results Analysis

Question Paper
Prerequisite and Recap

The student should have basic knowledge about:
• The concept of machine learning techniques.
• Machine learning techniques: familiarity with regression methods, classification methods, and clustering methods.
• Scaling up machine learning approaches.

Brief Introduction about the Subject (with Video)

https://www.youtube.com/watch?v=ukzFI9rgwfU
Unit 5 Content

• Brief Introduction to Machine Learning
• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning and Hypothesis Testing
• Probability Basics
• Linear Algebra, Statistical Decision Theory – Regression & Classification
• Bias–Variance
• Linear Regression
• Multivariate Regression

Objectives of Unit

The unit's main objectives are:
• Conceptualization and summarization of machine learning: to introduce students to the basic concepts and techniques of machine learning.
• Machine learning techniques: to become familiar with regression methods, classification methods, and clustering methods.
• Scaling up machine learning approaches.

Topic Objective / Topic Outcome

Name of Topic: Brief Introduction to Machine Learning; Supervised Learning; Unsupervised Learning; Reinforcement Learning and Hypothesis Testing; Probability Basics; Linear Algebra, Statistical Decision Theory – Regression & Classification; Bias–Variance; Linear Regression; Multivariate Regression
Objective of Topic: Students will be able to learn the fundamentals of machine learning methods.
Mapping with CO: CO1
Reinforcement Learning

(The introductory reinforcement learning slides, including the Markov Chain Process, were presented as figures.)
TYPES OF LEARNING (CONT'D)

4. Reinforcement Learning
• These methods differ from the previously studied methods and are used less often.
• In this kind of learning, there is an agent that we want to train over a period of time so that it can interact with a specific environment.
• The agent follows a set of strategies for interacting with the environment; after observing the environment, it takes actions with regard to the environment's current state.

The main steps of reinforcement learning methods are:
Step 1 − First, we need to prepare an agent with some initial set of strategies.
Step 2 − Then observe the environment and its current state.
Step 3 − Next, select the optimal policy for the current state of the environment and perform an appropriate action.
Step 4 − Now, the agent receives a corresponding reward or penalty in accordance with the action taken in the previous step.
Step 5 − Now, update the strategies if required.
Step 6 − Finally, repeat steps 2–5 until the agent learns and adopts the optimal policies.
Examples: video games, chess
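As a hedged illustration of this loop, the sketch below runs tabular Q-learning on a toy 5-state chain; the environment, reward scheme, and hyperparameters (alpha, gamma, epsilon) are assumptions for demonstration, not part of the slides.

```python
import numpy as np

# A minimal sketch of the agent-environment loop above, using tabular
# Q-learning on a hypothetical 5-state chain: move left/right, reward +1
# only for reaching the rightmost state. All details here are assumptions.
n_states, n_actions = 5, 2          # states 0..4, actions: 0=left, 1=right
Q = np.zeros((n_states, n_actions)) # Step 1: initial strategy (all-zero Q-table)
alpha, gamma, epsilon = 0.1, 0.9, 0.2

rng = np.random.default_rng(0)
for episode in range(500):          # Step 6: repeat until the policy stabilizes
    s = 0
    while s != n_states - 1:
        # Steps 2-3: observe the state, pick an action (epsilon-greedy policy)
        a = rng.integers(n_actions) if rng.random() < epsilon else int(np.argmax(Q[s]))
        s_next = max(s - 1, 0) if a == 0 else min(s + 1, n_states - 1)
        r = 1.0 if s_next == n_states - 1 else 0.0   # Step 4: reward or penalty
        # Step 5: update the strategy (Q-value) from the observed reward
        Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
        s = s_next

print(Q)  # the learned policy prefers action 1 (right) in every state
```

After enough episodes, the greedy policy derived from Q walks straight to the rewarding state, which is exactly the "learn and adopt the optimal policies" endpoint of step 6.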
Daily Quiz

1. Which of the following methods do we use to find the best-fit line for data in linear regression?
A) Least squares error
B) Maximum likelihood
C) Logarithmic loss
D) Both A and B

2. Which of the following is true about residuals?
A) Lower is better
B) Higher is better
C) A or B, depending on the situation
D) None of these

Weekly Assignment

1. What is supervised learning in ML?
2. What is unsupervised learning in machine learning?
3. What is the difference between supervised and unsupervised learning?
Recap

• Learning is the process of converting experience into expertise or knowledge.
• Supervised
• Unsupervised
• Semi-supervised
WHAT IS REGRESSION?

• Regression is a supervised learning technique which helps in finding the correlation between variables and enables us to predict a continuous output variable based on one or more predictor variables.
• Regression analysis is a statistical method to model the relationship between a dependent (target) variable and one or more independent (predictor) variables.
• More specifically, regression analysis helps us to understand how the value of the dependent variable changes with respect to one independent variable when the other independent variables are held fixed.
• It predicts continuous/real values such as temperature, age, salary, price, etc.

REGRESSION (CONT'D)

• In regression, we plot a graph between the variables which best fits the given datapoints; using this plot, the machine learning model can make predictions about the data.
• In simple words, "regression shows a line or curve that passes through the datapoints on the target-predictor graph in such a way that the vertical distance between the datapoints and the regression line is minimal."

Examples:
• Prediction of rain using temperature and other factors
• Determining market trends
• Prediction of road accidents due to rash driving
TERMINOLOGIES

• Dependent variable: the main factor in regression analysis which we want to predict or understand; also called the target variable.
• Independent variable: a factor which affects the dependent variable or is used to predict its values; also called a predictor.
• Outliers: an outlier is an observation with either a very low or a very high value compared to the other observed values. An outlier may distort the result, so it should be avoided.
• Multicollinearity: if the independent variables are highly correlated with each other, the condition is called multicollinearity. It should not be present in the dataset, because it creates problems when ranking the most influential variables.
• Underfitting and overfitting: if our algorithm works well on the training dataset but not on the test dataset, the problem is called overfitting. If our algorithm does not perform well even on the training dataset, the problem is called underfitting.
TYPES OF REGRESSION

1. LINEAR REGRESSION
• Linear regression is a statistical regression method which is used for predictive analysis.
• It is one of the simplest and easiest algorithms; it works on regression and shows the relationship between continuous variables.
• It is used for solving regression problems in machine learning.
• Linear regression shows the linear relationship between the independent variable (X-axis) and the dependent variable (Y-axis), hence the name linear regression.
• If there is only one input variable (x), it is called simple linear regression; if there is more than one input variable, it is called multiple linear regression.
• The relationship between the variables in a linear regression model can be illustrated by predicting the salary of an employee on the basis of years of experience.

The mathematical equation for linear regression is:

Y = mX + c

Here, Y = dependent variable (target variable), X = independent variable (predictor variable), m = slope, and c = intercept.
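As a minimal sketch of fitting Y = mX + c, the snippet below estimates m and c by ordinary least squares with NumPy; the experience/salary numbers are invented for illustration.

```python
import numpy as np

# Hypothetical years-of-experience (X) and salary (Y) data, for illustration only.
X = np.array([1, 2, 3, 4, 5, 6], dtype=float)
Y = np.array([35, 40, 48, 55, 59, 66], dtype=float)  # e.g. salary in thousands

# Ordinary least squares fit of Y = m*X + c (degree-1 polynomial).
m, c = np.polyfit(X, Y, deg=1)
print(f"slope m = {m:.2f}, intercept c = {c:.2f}")

# Predict the salary for a new employee with 7 years of experience.
print("prediction at X=7:", m * 7 + c)
```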
2. MULTIPLE REGRESSION
• Multiple regression generally explains the relationship between multiple independent (predictor) variables and one dependent (criterion) variable.
• The dependent variable is modeled as a function of several independent variables with corresponding coefficients, along with a constant term.
• Multiple regression requires two or more predictor variables, which is why it is called multiple regression.
• The multiple regression equation takes the following form:

y = b1x1 + b2x2 + … + bnxn + c

• Here, the bi (i = 1, 2, …, n) are the regression coefficients, each representing the amount by which the criterion variable changes when the corresponding predictor variable changes.
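A hedged sketch of multiple regression follows, solving for b1, b2, and c by least squares; the two-predictor dataset is invented for illustration.

```python
import numpy as np

# Sketch of multiple regression y = b1*x1 + b2*x2 + c via least squares.
# The two predictors and targets below are invented for illustration.
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 5.0]])
y = np.array([7.1, 6.9, 14.2, 13.8, 19.0])

# Append a column of ones so the constant term c is estimated too.
A = np.column_stack([X, np.ones(len(X))])
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
b1, b2, c = coeffs
print(f"b1={b1:.2f}, b2={b2:.2f}, c={c:.2f}")
```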
3. POLYNOMIAL REGRESSION
• Polynomial regression is a type of regression which models a non-linear dataset using a linear model.
• It is similar to multiple linear regression, but it fits a non-linear curve between the values of x and the corresponding conditional values of y.
• Suppose there is a dataset whose datapoints lie in a non-linear fashion; in such a case, linear regression will not fit those datapoints well. To cover such datapoints, we need polynomial regression.
• In polynomial regression, the original features are transformed into polynomial features of a given degree and then modeled using a linear model, which means the datapoints are best fitted using a polynomial curve.
• The equation for polynomial regression is also derived from the linear regression equation: the linear equation Y = b0 + b1x is transformed into the polynomial regression equation Y = b0 + b1x + b2x^2 + b3x^3 + … + bnx^n.
• Here Y is the predicted/target output and b0, b1, …, bn are the regression coefficients; x is our independent/input variable.
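A minimal sketch of this feature transformation follows, assuming scikit-learn is available: the input is expanded to polynomial features of a chosen degree and a linear model is fitted on them. The degree and the synthetic cubic data are illustrative choices.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Sketch: transform x into polynomial features of degree 3, then fit a
# linear model on them. The noisy cubic data here is synthetic.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 40).reshape(-1, 1)
y = 0.5 * x.ravel() ** 3 - x.ravel() + rng.normal(0, 1, 40)

X_poly = PolynomialFeatures(degree=3).fit_transform(x)  # columns [1, x, x^2, x^3]
model = LinearRegression().fit(X_poly, y)
print("R^2 on training data:", model.score(X_poly, y))
```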
4. SUPPORT VECTOR REGRESSION
• The support vector machine is a supervised learning algorithm which can be used for regression as well as classification problems.
• Support vector regression (SVR) is a regression algorithm which works for continuous variables.
• Kernel: a function used to map lower-dimensional data into higher-dimensional data.
• Hyperplane: in a general SVM, it is the separating line between two classes, but in SVR it is the line which helps to predict the continuous variable and covers most of the datapoints.
• Boundary lines: the two lines on either side of the hyperplane, which create a margin for the datapoints.
• Support vectors: the datapoints of either class which are nearest to the hyperplane.
• In SVR, we always try to determine a hyperplane with a maximum margin, so that the maximum number of datapoints is covered within that margin. The main goal of SVR is to include the maximum number of datapoints within the boundary lines, and the hyperplane (best-fit line) must contain a maximum number of datapoints.
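A minimal SVR sketch follows, assuming scikit-learn; the RBF kernel and the hyperparameters C and epsilon are illustrative choices, not values from the slides.

```python
import numpy as np
from sklearn.svm import SVR

# Sketch: fit SVR with an RBF kernel to noisy sine data. The data,
# kernel choice, and hyperparameters (C, epsilon) are illustrative.
rng = np.random.default_rng(1)
X = np.sort(rng.uniform(0, 5, 60)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.1, 60)

svr = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(X, y)
print("number of support vectors:", len(svr.support_))
print("prediction at x=2.5:", svr.predict([[2.5]]))
```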
Decision Tree(CONT’D)
• Decision Tree is a Supervised learning technique that can be
used for both classification and Regression problems, but
mostly it is preferred for solving Classification problems.
• It is a tree-structured classifier, where internal nodes
represent the features of a dataset, branches represent the
decision rules and each leaf node represents the outcome.
• In a Decision tree, there are two nodes, which are the Decision
Node and Leaf Node. Decision nodes are used to make any
decision and have multiple branches, whereas Leaf nodes are
the output of those decisions and do not contain any further
branches.
• The decisions or the test are performed on the basis of features
of the given dataset.
1/22/2023 Machine Learning (AEC0516) unit-5 68
Decision Tree
• It is a graphical representation for getting all the possible
solutions to a problem/decision based on given conditions.
• It is called a decision tree because, similar to a tree, it starts
with the root node, which expands on further branches and
constructs a tree-like structure.
• In order to build a tree, we use the CART algorithm, which
stands for Classification and Regression Tree algorithm.
• A decision tree simply asks a question, and based on the
answer (Yes/No), it further split the tree into subtrees.
• NOTE: A decision tree can contain categorical data (YES/NO)
as well as numeric data.
1/22/2023 Machine Learning (AEC0516) unit-5 69
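As a hedged sketch of a tree-structured classifier, the snippet below fits a small CART-style tree with scikit-learn and prints its decision rules; the Iris dataset and max_depth=2 are stand-in choices for illustration, not from the slides.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Sketch: fit a CART-style decision tree with scikit-learn on the Iris
# dataset (a stand-in example; the slides use Play Tennis later).
iris = load_iris()
clf = DecisionTreeClassifier(criterion="entropy", max_depth=2, random_state=0)
clf.fit(iris.data, iris.target)

# Print the learned decision rules: internal nodes test features,
# leaves hold the predicted class.
print(export_text(clf, feature_names=iris.feature_names))
```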
Decision Tree(CONT’D)
• Iterative Dichotomiser 3 or commonly known as ID3.
ID3 was invented by Ross Quinlan.
• It is a classification algorithm that follows a greedy
approach of building a decision tree by selecting a
best attribute that yields maximum Information
Gain (IG) or minimum Entropy (H).
• Decision Tree is most effective if the problem
characteristics look like the following points:
1) Instances can be described by attribute-value pairs.
2) Target function is discrete-valued.
1/22/2023 Machine Learning (AEC0516) unit-5 70
ID3 ALGORITHM
• “Entropy is the measurement of homogeneity.
• It returns us the information about an arbitrary dataset
that how impure/non-homogeneous the data set is.”
• Given a collection of examples/dataset S, containing
positive and negative examples of some target concept,
the entropy of S relative to this boolean classification is-
1/22/2023 Machine Learning (AEC0516) unit-5 71
ID3 ALGORITHM
• When we use a node in a decision tree to partition the training instances into smaller subsets, the entropy changes. Information gain is a measure of this change in entropy:

Information Gain = entropy(parent) - [average entropy(children)]
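A minimal sketch of these two formulas in Python follows; the helper names are mine, and the example split reproduces the Play Tennis Outlook numbers worked out below.

```python
import math
from collections import Counter

# Sketch: entropy and information gain as defined above, for a list of
# class labels and a candidate split into child subsets.
def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    n = len(parent)
    weighted = sum(len(ch) / n * entropy(ch) for ch in children)
    return entropy(parent) - weighted

# Play Tennis dataset: 9 "yes", 5 "no" -> H(S) ≈ 0.94 (matches the slides).
parent = ["yes"] * 9 + ["no"] * 5
# Split by Outlook: sunny (2 yes, 3 no), overcast (4 yes), rain (3 yes, 2 no).
children = [["yes"] * 2 + ["no"] * 3, ["yes"] * 4, ["yes"] * 3 + ["no"] * 2]
print(round(information_gain(parent, children), 3))  # ≈ 0.247
```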
Example: Decision Tree for Play Tennis

• Here, the dataset has binary classes (yes and no), where 9 out of 14 examples are "yes" and 5 out of 14 are "no". The complete entropy of the dataset is:

H(S) = - p(yes) * log2(p(yes)) - p(no) * log2(p(no))
     = - (9/14) * log2(9/14) - (5/14) * log2(5/14)
     = - (-0.41) - (-0.53)
     = 0.94

First attribute – Outlook
• Categorical values: sunny, overcast, rain
• H(Outlook=sunny) = -(2/5)*log2(2/5) - (3/5)*log2(3/5) = 0.971
• H(Outlook=rain) = -(3/5)*log2(3/5) - (2/5)*log2(2/5) = 0.971
• H(Outlook=overcast) = -(4/4)*log2(4/4) - 0 = 0

Average entropy information for Outlook:
I(Outlook) = p(sunny)*H(Outlook=sunny) + p(rain)*H(Outlook=rain) + p(overcast)*H(Outlook=overcast)
           = (5/14)*0.971 + (5/14)*0.971 + (4/14)*0
           = 0.693

Information Gain = H(S) - I(Outlook) = 0.94 - 0.693 = 0.247
Second attribute – Temperature
• Categorical values: hot, mild, cool
• H(Temperature=hot) = -(2/4)*log2(2/4) - (2/4)*log2(2/4) = 1
• H(Temperature=cool) = -(3/4)*log2(3/4) - (1/4)*log2(1/4) = 0.811
• H(Temperature=mild) = -(4/6)*log2(4/6) - (2/6)*log2(2/6) = 0.9179

Average entropy information for Temperature:
I(Temperature) = p(hot)*H(Temperature=hot) + p(mild)*H(Temperature=mild) + p(cool)*H(Temperature=cool)
               = (4/14)*1 + (6/14)*0.9179 + (4/14)*0.811
               = 0.9108

Information Gain = H(S) - I(Temperature) = 0.94 - 0.9108 = 0.0292

Third attribute – Humidity
• Categorical values: high, normal
• H(Humidity=high) = -(3/7)*log2(3/7) - (4/7)*log2(4/7) = 0.983
• H(Humidity=normal) = -(6/7)*log2(6/7) - (1/7)*log2(1/7) = 0.591

Average entropy information for Humidity:
I(Humidity) = p(high)*H(Humidity=high) + p(normal)*H(Humidity=normal)
            = (7/14)*0.983 + (7/14)*0.591
            = 0.787

Information Gain = H(S) - I(Humidity) = 0.94 - 0.787 = 0.153
Fourth attribute – Wind
• Categorical values: weak, strong
• H(Wind=weak) = -(6/8)*log2(6/8) - (2/8)*log2(2/8) = 0.811
• H(Wind=strong) = -(3/6)*log2(3/6) - (3/6)*log2(3/6) = 1

Average entropy information for Wind:
I(Wind) = p(weak)*H(Wind=weak) + p(strong)*H(Wind=strong)
        = (8/14)*0.811 + (6/14)*1
        = 0.892

Information Gain = H(S) - I(Wind) = 0.94 - 0.892 = 0.048

Here, the attribute with maximum information gain is Outlook, so the decision tree built so far splits on Outlook.

When Outlook = overcast, the subset is a pure class (yes). Now we repeat the same procedure for the rows with Outlook = Sunny, and then for Outlook = Rain.

Finding the best attribute for splitting the data with Outlook = Sunny (dataset rows = [1, 2, 8, 9, 11]):

The complete entropy of the Sunny subset is:
H(Sunny) = - p(yes) * log2(p(yes)) - p(no) * log2(p(no))
         = - (2/5) * log2(2/5) - (3/5) * log2(3/5)
         = 0.971
First attribute – Temperature
• Categorical values: hot, mild, cool
• H(Sunny, Temperature=hot) = -0 - (2/2)*log2(2/2) = 0
• H(Sunny, Temperature=cool) = -(1)*log2(1) - 0 = 0
• H(Sunny, Temperature=mild) = -(1/2)*log2(1/2) - (1/2)*log2(1/2) = 1

Average entropy information for Temperature:
I(Sunny, Temperature) = p(Sunny, hot)*H(Sunny, Temperature=hot) + p(Sunny, mild)*H(Sunny, Temperature=mild) + p(Sunny, cool)*H(Sunny, Temperature=cool)
                      = (2/5)*0 + (1/5)*0 + (2/5)*1
                      = 0.4

Information Gain = H(Sunny) - I(Sunny, Temperature) = 0.971 - 0.4 = 0.571

Second attribute – Humidity
• Categorical values: high, normal
• H(Sunny, Humidity=high) = -0 - (3/3)*log2(3/3) = 0
• H(Sunny, Humidity=normal) = -(2/2)*log2(2/2) - 0 = 0

Average entropy information for Humidity:
I(Sunny, Humidity) = p(Sunny, high)*H(Sunny, Humidity=high) + p(Sunny, normal)*H(Sunny, Humidity=normal)
                   = (3/5)*0 + (2/5)*0
                   = 0

Information Gain = H(Sunny) - I(Sunny, Humidity) = 0.971 - 0 = 0.971

Third attribute – Wind
• Categorical values: weak, strong
• H(Sunny, Wind=weak) = -(1/3)*log2(1/3) - (2/3)*log2(2/3) = 0.918
• H(Sunny, Wind=strong) = -(1/2)*log2(1/2) - (1/2)*log2(1/2) = 1

Average entropy information for Wind:
I(Sunny, Wind) = p(Sunny, weak)*H(Sunny, Wind=weak) + p(Sunny, strong)*H(Sunny, Wind=strong)
               = (3/5)*0.918 + (2/5)*1
               = 0.9508

Information Gain = H(Sunny) - I(Sunny, Wind) = 0.971 - 0.9508 = 0.0202

Here, the attribute with maximum information gain is Humidity, so the Sunny branch splits on Humidity.
When Outlook = Sunny and Humidity = High, the subset is a pure class of category "no"; when Outlook = Sunny and Humidity = Normal, it is again a pure class of category "yes". Therefore, no further calculations are needed on that branch.

Finding the best attribute for splitting the data with Outlook = Rain (dataset rows = [4, 5, 6, 10, 14]):

The complete entropy of the Rain subset is:
H(Rain) = - p(yes) * log2(p(yes)) - p(no) * log2(p(no))
        = - (3/5) * log2(3/5) - (2/5) * log2(2/5)
        = 0.971

First attribute – Temperature
• Categorical values: mild, cool
• H(Rain, Temperature=cool) = -(1/2)*log2(1/2) - (1/2)*log2(1/2) = 1
• H(Rain, Temperature=mild) = -(2/3)*log2(2/3) - (1/3)*log2(1/3) = 0.918

Average entropy information for Temperature:
I(Rain, Temperature) = p(Rain, mild)*H(Rain, Temperature=mild) + p(Rain, cool)*H(Rain, Temperature=cool)
                     = (3/5)*0.918 + (2/5)*1
                     = 0.9508

Information Gain = H(Rain) - I(Rain, Temperature) = 0.971 - 0.9508 = 0.0202

Second attribute – Wind
• Categorical values: weak, strong
• H(Rain, Wind=weak) = -(3/3)*log2(3/3) - 0 = 0
• H(Rain, Wind=strong) = -0 - (2/2)*log2(2/2) = 0

Average entropy information for Wind:
I(Rain, Wind) = p(Rain, weak)*H(Rain, Wind=weak) + p(Rain, strong)*H(Rain, Wind=strong)
              = (3/5)*0 + (2/5)*0
              = 0

Information Gain = H(Rain) - I(Rain, Wind) = 0.971 - 0 = 0.971

Here, the attribute with maximum information gain is Wind, so the Rain branch splits on Wind. This completes the decision tree.
EXAMPLE 2: PRACTICE

For the medical diagnosis data below, create a decision tree:

Sore Throat | Fever | Swollen Glands | Congestion | Headache | Diagnosis
YES | YES | YES | YES | YES | STREP THROAT
NO  | NO  | NO  | YES | YES | ALLERGY
YES | YES | NO  | YES | NO  | COLD
YES | NO  | YES | NO  | NO  | STREP THROAT
NO  | YES | NO  | YES | NO  | COLD
NO  | NO  | NO  | YES | NO  | ALLERGY
NO  | NO  | YES | NO  | NO  | STREP THROAT
YES | NO  | NO  | YES | YES | ALLERGY
NO  | YES | NO  | YES | YES | COLD
YES | NO  | NO  | YES | YES | COLD
Classification and Regression Tree (CART)

(The CART worked example was presented as figures.)
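Since the CART slides above were figures, here is only a hedged sketch of the impurity measure CART commonly uses instead of ID3's entropy, the Gini index; the helper name and example numbers are mine.

```python
from collections import Counter

# Sketch: CART typically chooses splits by Gini impurity,
# Gini(S) = 1 - sum over classes k of p_k^2.
def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

# Play Tennis root node: 9 yes / 5 no.
print(round(gini(["yes"] * 9 + ["no"] * 5), 3))  # ≈ 0.459
```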
INDUCTIVE BIAS IN DECISION TREES

• The inductive bias (also known as learning bias) of a learning algorithm is the set of assumptions that the learner uses to predict outputs for inputs it has not encountered.
• Inductive bias in ID3:
  – Approximate inductive bias of ID3: shorter trees are preferred over larger trees.
  – BFS-ID3, a closer approximation to the inductive bias of ID3: shorter trees are preferred over longer trees, and trees that place high-information-gain attributes close to the root are preferred over those that do not.

DIFFERENCE BETWEEN ID3 AND CANDIDATE-ELIMINATION

• ID3
  – Searches a complete hypothesis space incompletely.
  – Its inductive bias is solely a consequence of the ordering of hypotheses by its search strategy.
• Candidate-Elimination
  – Searches an incomplete hypothesis space completely.
  – Its inductive bias is solely a consequence of the expressive power of its hypothesis representation.
ISSUES IN DECISION TREES

Practical issues in learning decision trees include:
• determining how deeply to grow the decision tree,
• handling continuous attributes,
• choosing an appropriate attribute selection measure,
• handling training data with missing attribute values,
• handling attributes with differing costs, and
• improving computational efficiency.
5. RANDOM FOREST REGRESSION
• Random forest is a popular machine learning algorithm that belongs to the supervised learning technique.
• It can be used for both classification and regression problems in ML.
• It is based on the concept of ensemble learning, which is the process of combining multiple classifiers to solve a complex problem and improve the performance of the model.
• As the name suggests, "a random forest is a classifier that contains a number of decision trees on various subsets of the given dataset and takes the average to improve the predictive accuracy of that dataset." Instead of relying on one decision tree, the random forest takes the prediction from each tree and, based on the majority vote of predictions, predicts the final output.
• A greater number of trees in the forest leads to higher accuracy and prevents the problem of overfitting.
• The working of the random forest algorithm can be illustrated with an example: suppose there is a dataset that contains multiple fruit images. This dataset is given to the random forest classifier, which divides it into subsets and gives each subset to a decision tree. During the training phase, each decision tree produces a prediction result, and when a new data point occurs, the random forest classifier predicts the final decision based on the majority of the results. A sketch follows.
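As a minimal random forest regression sketch, assuming scikit-learn; the synthetic dataset and n_estimators=100 are illustrative choices.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Sketch: a random forest regressor that averages many decision trees,
# fitted on synthetic data. All choices here are for illustration.
X, y = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=0)
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print("R^2 on training data:", round(forest.score(X, y), 3))
```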
INSTANCE-BASED LEARNING

• Machine learning systems categorized as instance-based learning are systems that learn the training examples by heart and then generalize to new instances based on some similarity measure.
• It is called instance-based because it builds its hypotheses from the training instances.
• It is also known as memory-based learning or lazy learning.
• The time complexity of this approach depends on the size of the training data; the worst-case time complexity is O(n), where n is the number of training instances.

Some of the instance-based learning algorithms are:
1. K-Nearest Neighbors (KNN)
2. Self-Organizing Map (SOM)
3. Learning Vector Quantization (LVQ)
4. Locally Weighted Learning (LWL)

Advantages:
1. Instead of estimating the target function over the entire instance space, local approximations can be made.
2. The algorithm can easily adapt to new data, which is collected as we go.

Disadvantages:
1. Classification costs are high.
2. A large amount of memory is required to store the data, and each query involves building a local model from scratch.
K-NEAREST NEIGHBORS (KNN) LEARNING

• K-nearest neighbors is one of the simplest machine learning algorithms, based on the supervised learning technique.
• The K-NN algorithm assumes similarity between the new case/data and the available cases, and puts the new case into the category most similar to the available categories.
• The K-NN algorithm stores all the available data and classifies a new data point based on similarity. This means that when new data appears, it can be easily classified into a well-suited category using K-NN.
• The K-NN algorithm can be used for regression as well as classification, but it is mostly used for classification problems.
• K-NN is a non-parametric algorithm, which means it does not make any assumption about the underlying data.
• It is also called a lazy learner algorithm because it does not learn from the training set immediately; instead it stores the dataset and, at the time of classification, performs an action on the dataset.
• At the training phase, the KNN algorithm just stores the dataset; when it gets new data, it classifies that data into the category most similar to the new data.
KNN ALGORITHM

The working of K-NN can be explained on the basis of the following algorithm:
• Step 1: Select the number K of neighbors.
• Step 2: Calculate the Euclidean distance from the new point to each data point.
• Step 3: Take the K nearest neighbors as per the calculated Euclidean distances.
• Step 4: Among these K neighbors, count the number of data points in each category.
• Step 5: Assign the new data point to the category for which the number of neighbors is maximum.
• Step 6: Our model is ready.
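A hedged sketch of these steps follows, assuming scikit-learn; it reuses the small dataset from the numerical example later in this unit, with k = 3.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Sketch of the KNN steps above with scikit-learn, on the (P1, P2) dataset
# from the numerical example below, with K = 3.
X = np.array([[7, 7], [7, 4], [1, 4], [3, 4]])
y = np.array(["False", "False", "True", "True"])

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)   # Steps 1-2
print(knn.predict([[3, 7]]))                          # Steps 3-5 -> ['True']
```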
WORKING OF KNN

Suppose we have a new data point and we need to put it in the required category:
• Firstly, we choose the number of neighbors; say k = 5.
• Next, we calculate the Euclidean distance between the data points. The Euclidean distance between two points (x1, y1) and (x2, y2), familiar from geometry, is d = sqrt((x2 - x1)^2 + (y2 - y1)^2).
• By calculating the Euclidean distances we obtain the nearest neighbors: three nearest neighbors in category A and two nearest neighbors in category B.
• Since the three nearest neighbors are from category A, the new data point must belong to category A.
PROS AND CONS OF KNN

Advantages of the KNN algorithm:
• It is simple to implement.
• It is robust to noisy training data.
• It can be more effective if the training data is large.

Disadvantages of the KNN algorithm:
• The value of K always needs to be determined, which may sometimes be complex.
• The computation cost is high, because the distance to every training sample must be calculated.
NUMERICAL ON KNN

Perform the KNN classification algorithm on the following data set and predict the class for x (P1 = 3 and P2 = 7), where k = 3.

P1 | P2 | Class
7  | 7  | False
7  | 4  | False
1  | 4  | True
3  | 4  | True

Euclidean distance = sqrt((x1 - x2)^2 + (y1 - y2)^2)

D(x, i)   = sqrt((3 - 7)^2 + (7 - 7)^2) = 4
D(x, ii)  = sqrt((3 - 7)^2 + (7 - 4)^2) = 5
D(x, iii) = sqrt((3 - 1)^2 + (7 - 4)^2) = 3.6
D(x, iv)  = sqrt((3 - 3)^2 + (7 - 4)^2) = 3

We need to find the three nearest neighbors, i.e. the three lowest distances: 3 (True), 3.6 (True), and 4 (False).

Thus, x (P1 = 3 and P2 = 7) belongs to class True.
WHY IS KNN NON-PARAMETRIC?

Non-parametric means making no assumptions about the underlying data distribution. Non-parametric methods do not have a fixed number of parameters in the model. Similarly, in KNN the number of model parameters actually grows with the training data set: you can imagine each training case as a "parameter" in the model.
PRACTICE NUMERICAL ON KNN

Suppose we have the height, weight, and T-shirt size of some customers, and we need to predict the T-shirt size of a new customer given only the height and weight information we have. The data is shown below:

Height (cm) | Weight (kg) | T-Shirt Size
158 | 58 | M
158 | 59 | M
158 | 63 | M
160 | 59 | M
160 | 60 | M
163 | 60 | M
163 | 61 | M
160 | 64 | L
163 | 64 | L
165 | 61 | L
165 | 62 | L
165 | 65 | L
168 | 62 | L
168 | 63 | L
168 | 66 | L
170 | 63 | L
170 | 64 | L
170 | 68 | L
PROBLEMS IN LINEAR REGRESSION

• One of the problems in linear regression is that it tries to fit a constant line to your data once the model has been created.
• Such behaviour might be acceptable when your data follows a linear pattern and does not have much noise.
• However, when the dataset is not linear, linear regression tends to underfit the training data.
LOCALLY WEIGHTED REGRESSION

• Model-based methods, such as neural networks and the mixture of Gaussians, use the data to build a parameterized model. After training, the model is used for predictions and the data are generally discarded.
• In contrast, "memory-based" methods are non-parametric approaches that explicitly retain the training data and use it each time a prediction needs to be made.
• Locally weighted regression (LWR) is a memory-based method that performs a regression around a point of interest using only training data that are "local" to that point.
• In locally weighted regression, points are weighted by their proximity to the current query point x using a kernel; a regression is then computed using the weighted points.
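A minimal LWR sketch follows; the Gaussian kernel, the bandwidth tau, and the synthetic data are assumptions of mine for illustration.

```python
import numpy as np

# Sketch of locally weighted (linear) regression: for a query point x0,
# weight each training point by a Gaussian kernel of its distance to x0,
# then solve a weighted least-squares fit. Bandwidth tau is an assumption.
def lwr_predict(x0, X, y, tau=0.5):
    w = np.exp(-((X - x0) ** 2) / (2 * tau ** 2))      # kernel weights
    A = np.column_stack([X, np.ones_like(X)])          # design matrix [x, 1]
    W = np.diag(w)
    theta = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)  # weighted normal equations
    return theta[0] * x0 + theta[1]

rng = np.random.default_rng(0)
X = np.linspace(0, 6, 50)
y = np.sin(X) + rng.normal(0, 0.1, 50)
print(lwr_predict(2.0, X, y))   # close to sin(2.0) ≈ 0.909
```

Because the fit is recomputed per query from the retained training data, this matches the memory-based character described above.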
CASE-BASED LEARNING

• Case-based reasoning (CBR) classifiers use a database of problem solutions to solve new problems. They store the tuples or cases for problem-solving as complex symbolic descriptions.

HOW CBR WORKS
• When a new case arises to classify, a case-based reasoner first checks whether an identical training case exists.
• If one is found, then the accompanying solution to that case is returned.
• If no identical case is found, then the CBR searches for training cases having components that are similar to those of the new case.
• Conceptually, these training cases may be considered neighbours of the new case.
• If cases are represented as graphs, this involves searching for subgraphs that are similar to subgraphs within the new case.
• The CBR tries to combine the solutions of the neighbouring training cases to propose a solution for the new case.
• If incompatibilities arise with the individual solutions, then backtracking to search for other solutions may be necessary.
• The CBR may employ background knowledge and problem-solving strategies to propose a feasible solution.

APPLICATIONS OF CBR
1. Problem resolution for customer service help desks, where cases describe product-related diagnostic problems.
2. Areas such as engineering and law, where cases are technical designs or legal rulings, respectively.
3. Medical education, where patient case histories and treatments are used to help diagnose and treat new patients.
PRACTICE NUMERICAL 1

Using the KNN algorithm, predict what class of fan Michelle is, given that Michelle is female and her age is 5. Assume k = 3.

NAME    | AGE | GENDER | FAN
BILL    | 32  | M | Rolling Stone
HENRY   | 40  | M | Neither
MARY    | 16  | F | Taylor Swift
TIFFANY | 14  | F | Taylor Swift
MICHAEL | 55  | M | Neither
CARLOS  | 40  | M | Taylor Swift
ASHLEY  | 20  | F | Neither
ROBERT  | 15  | M | Taylor Swift
SALLY   | 55  | F | Rolling Stone
JOHN    | 15  | M | Rolling Stone
Solution

Convert the discrete value of the Gender attribute to a numeric value. Let us assume M = 0 and F = 1.

NAME    | AGE | GENDER | DISTANCE | FAN
BILL    | 32  | 0 | 27.02 | Rolling Stone
HENRY   | 40  | 0 | 35.01 | Neither
MARY    | 16  | 1 | 11.00 | Taylor Swift
TIFFANY | 14  | 1 |  9.00 | Taylor Swift
MICHAEL | 55  | 0 | 50.01 | Neither
CARLOS  | 40  | 0 | 35.01 | Taylor Swift
ASHLEY  | 20  | 1 | 15.00 | Neither
ROBERT  | 15  | 0 | 10.05 | Taylor Swift
SALLY   | 55  | 1 | 50.00 | Rolling Stone
JOHN    | 15  | 0 | 10.05 | Rolling Stone

The three nearest neighbors are Tiffany (9.00, Taylor Swift), Robert (10.05, Taylor Swift), and John (10.05, Rolling Stone), so Michelle is predicted to be a Taylor Swift fan.
PRACTICE NUMERICAL 2

Using the KNN algorithm, predict the class of a new flower (sepal length = 5.2, sepal width = 3.1). Assume k = 5. (The reference dataset for this exercise was provided as a figure.)
PRACTICE NUMERICAL 3

Build a decision tree using the ID3 algorithm. The ID3 algorithm performs the following tasks recursively:
1. Create a root node for the tree.
2. If all examples are positive, return the leaf node 'positive'.
3. Else, if all examples are negative, return the leaf node 'negative'.
4. Calculate the entropy of the current state, H(S).
5. For each attribute, calculate the entropy with respect to the attribute 'x', denoted by H(S, x).
6. Select the attribute which has the maximum value of IG(S, x).
7. Remove the attribute that offers the highest IG from the set of attributes.
8. Repeat until we run out of attributes, or the decision tree has all leaf nodes.
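A hedged sketch of this recursion follows; the dataset format (a list of dicts plus a target key) and the helper names are assumptions of mine, not from the slides.

```python
import math
from collections import Counter

# Sketch of the recursive ID3 procedure above. Each row is a dict of
# attribute -> value, plus a target key holding the class label.
def entropy(rows, target):
    n = len(rows)
    counts = Counter(r[target] for r in rows)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def id3(rows, attributes, target):
    labels = {r[target] for r in rows}
    if len(labels) == 1:                         # steps 2-3: pure node -> leaf
        return labels.pop()
    if not attributes:                           # step 8: out of attributes
        return Counter(r[target] for r in rows).most_common(1)[0][0]
    def gain(a):                                 # steps 4-5: H(S) - H(S, a)
        parts = Counter(r[a] for r in rows)
        rem = sum((c / len(rows)) * entropy([r for r in rows if r[a] == v], target)
                  for v, c in parts.items())
        return entropy(rows, target) - rem
    best = max(attributes, key=gain)             # step 6: maximum information gain
    rest = [a for a in attributes if a != best]  # step 7: remove the attribute
    return {best: {v: id3([r for r in rows if r[best] == v], rest, target)
                   for v in {r[best] for r in rows}}}

# Tiny usage example with made-up rows:
data = [{"Outlook": "sunny", "Play": "no"},
        {"Outlook": "overcast", "Play": "yes"},
        {"Outlook": "rain", "Play": "yes"}]
print(id3(data, ["Outlook"], "Play"))
```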
SOLUTION 3

Step 1: The initial step is to calculate H(S), the entropy of the current state. In the example above, there are 5 No's and 9 Yes's in total.

Step 2: The next step is to calculate H(S, x), the entropy with respect to attribute x, for each attribute: the expected information needed to classify a tuple in S if the tuples are partitioned according to Age. Hence the gain in information from such a partitioning can be computed; the remaining attributes are handled similarly.

Step 3: Choose the attribute with the largest information gain, IG(S, x), as the decision node; divide the dataset by its branches and repeat the same process on every branch. Age has the highest information gain among the attributes, so Age is selected as the splitting attribute.

Step 4a: A branch with an entropy of 0 is a leaf node.

Step 4b: A branch with entropy greater than 0 needs further splitting.

Step 5: The ID3 algorithm is run recursively on the non-leaf branches until all data is classified.
PRACTICE NUMERICAL 4

Apply the ID3 algorithm to construct the tree-structured classifier.

Faculty Video Links, YouTube & NPTEL Video Links and Online Course Details

1. Machine Learning by Prof. Balaraman Ravindran, Department of Computer Science and Engineering, IIT Madras (SWAYAM/NPTEL):
https://www.youtube.com/watch?v=fC7V8QsPBec&feature=youtu.be
2. Machine Learning by Prof. Sudeshna Sarkar, Department of Computer Science and Engineering, IIT Kharagpur (NPTEL):
https://www.youtube.com/watch?v=EWmCkVfPnJ8&list=PLlGkyYYWOSOsGU-XARWdIFsRAJQkyBrVj&index=2
3. Machine Learning upGrad course by IIIT Bangalore:
https://www.upgrad.com/machine-learning-ai-pgd-iiitb/
Daily Quiz

1. Decision trees are an algorithm for which machine learning task?
a) clustering
b) dimensionality reduction
c) classification
d) regression

2. Which error metric is most appropriate for evaluating a {0,1} classification task?
a) worst-case error
b) sum-of-squares error
c) entropy
d) precision and recall

3. A _________ is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility.
a) Decision tree
b) Graphs
c) Trees
d) Neural networks

4. Which of the following are advantages of decision trees?
a) Possible scenarios can be added
b) They are a white-box model: a given result can be explained by the model
c) Worst, best, and expected values can be determined for different scenarios
d) All of the mentioned

5. Which of the following algorithms doesn't use learning rate as one of its hyperparameters?
a) Gradient Boosting
b) Extra Trees
c) AdaBoost
d) Random Forest
Weekly Assignment

NOIDA INSTITUTE OF ENGINEERING & TECHNOLOGY, GREATER NOIDA
SEMESTER (EVEN) — UNIT 2, ASSIGNMENT SHEET No. 2
Subject Name: Machine Learning
Subject Code: AMTAI0201 (M. Tech)
Name of Course Coordinator: Shweta Mayor

1. Define decision tree with an example.
2. Explain the types of decision trees.
3. Explain optimizing decision tree performance.
4. Explain overfitting and underfitting.
5. What is artificial intelligence?
6. What is the difference between supervised and unsupervised machine learning?
7. List the different algorithm techniques in machine learning.
8. Differentiate between supervised, unsupervised, and reinforcement learning.
9. What is a perceptron in machine learning?
10. What are model accuracy and model performance?
MCQs

1. Which of the following is NOT a factor affecting the performance of a learner system?
a) Representation scheme used
b) Training scenario
c) Type of feedback
d) Good data structures

2. What is true regarding the backpropagation rule?
a) It is also called the generalized delta rule
b) Error in the output is propagated backwards only to determine weight updates
c) There is no feedback of the signal at any stage
d) All of the mentioned
Old Question Papers

Sub Code: MTCS031    Paper Id: 210201
M. TECH. (SEM-II) THEORY EXAMINATION 2018-19
MACHINE LEARNING
Time: 3 Hours    Total Marks: 70
Note: Attempt all sections. If any data is missing, choose suitably.

SECTION A
Attempt all questions in brief. (2 x 7 = 14)
a. Define machine learning.
b. Explain the regression model.
c. What is an ANN?
d. Explain well-defined learning problems.
e. Define decision tree.
f. Explain the Bayes classifier.
g. Explain Q-learning.

SECTION B
Attempt any three of the following. (7 x 3 = 21)
a. Explain the role of genetic algorithms in knowledge-based techniques.
b. Differentiate between a genetic algorithm and a traditional algorithm with a suitable example.
c. Explain various ANN architectures in detail.
d. Describe an algorithm to implement simulated annealing.
e. Explain DBSCAN and its role in forming clusters.

SECTION C
Attempt any one part of the following. (7 x 1 = 7)
(a) Explain the backpropagation algorithm with a suitable example.
(b) Explain learning with any two learning techniques, with expressions for weight updating.

Attempt any one part of the following. (7 x 1 = 7)
(a) Write short notes on the following:
(i) Sampling theory
(ii) Bayes theorem
(b) Explain any comparative learning technique with a suitable example.

Attempt any one part of the following. (7 x 1 = 7)
(a) Explain the following: (i) Generalization (ii) Multilayer networks
(b) Describe the decision tree learning algorithm with an example.

Attempt any one part of the following. (7 x 1 = 7)
(a) Define the process of designing a learning system. Explain various issues in machine learning.
(b) Explain the candidate elimination algorithm in detail.

Attempt any one part of the following. (7 x 1 = 7)
(a) Explain FOIL in detail.
(b) Explain the following: (i) Hypotheses (ii) Inductive bias (iii) Perceptron
Expected Questions for University Exam

1. Explain the various types of issues in machine learning.
2. Define learning classifiers.
3. Differentiate between Bayesian learning and instance-based learning.
4. Discuss the steps in the KNN algorithm and its applications.
5. Explain the backpropagation algorithm and derive expressions for the weight update relations.
6. Describe the ID3 algorithm with a proper example.
Summary

Perceptron training rule: guaranteed to succeed if
1. the training examples are linearly separable, and
2. the learning rate η is sufficiently small.

Adaline training rule: uses gradient descent
1. guaranteed to converge to the hypothesis with minimum squared error,
2. given a sufficiently small learning rate η,
3. even when the training data contains noise,
4. even when the training data is not separable by H.

Problems: slow convergence to a local or global minimum.
References

Reference Books:
1. Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, An Introduction to Statistical Learning, Springer, 2013.
2. Richard Duda, Peter Hart, David Stork, Pattern Classification, 2nd Ed., John Wiley & Sons, 2001.
3. Tom M. Mitchell, Machine Learning, McGraw Hill International Edition.
4. Ethem Alpaydin, Introduction to Machine Learning, Eastern Economy Edition, Prentice Hall of India, 2005.
5. C. Bishop, Pattern Recognition and Machine Learning, Berlin: Springer-Verlag.
Reinforcement Learning
[Figure slides: reinforcement learning overview, the agent-environment interaction loop, and the Markov chain process.]
TYPES OF LEARNING (CONT'D)
4. Reinforcement Learning
• These methods differ from the methods studied previously. Here, an agent is trained over a period of time so that it can interact with a specific environment.
• The agent follows a set of strategies (a policy) for interacting with the environment; after observing the environment, it takes actions based on the environment's current state.
The main steps of reinforcement learning methods are:
Step 1: Prepare an agent with some initial set of strategies.
Step 2: Observe the environment and its current state.
Step 3: Select the optimal policy for the current state of the environment and perform the appropriate action.
Step 4: The agent receives a corresponding reward or penalty for the action taken in the previous step.
Step 5: Update the strategies if required.
Step 6: Repeat steps 2-5 until the agent learns and adopts the optimal policy.
Examples: video games, chess. A minimal code sketch of this loop follows.
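Below is a minimal, illustrative sketch of steps 1-6 using tabular Q-learning on a hypothetical 5-state "chain" environment; the environment, rewards, and hyperparameters are invented for illustration and are not from the slides.

import numpy as np

n_states, n_actions = 5, 2          # states 0..4, actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions)) # Step 1: initial (all-zero) strategy table
alpha, gamma, epsilon = 0.1, 0.9, 0.2

rng = np.random.default_rng(0)
for episode in range(500):
    state = 0                                 # Step 2: observe the initial state
    for t in range(20):
        if rng.random() < epsilon:            # Step 3: pick an action (epsilon-greedy policy)
            action = int(rng.integers(n_actions))
        else:
            action = int(np.argmax(Q[state]))
        next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
        reward = 1.0 if next_state == n_states - 1 else 0.0   # Step 4: reward from environment
        # Step 5: update the strategy (Q-table) from the observed transition
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state                    # Step 6: repeat until the policy converges

print("Greedy action per state:", np.argmax(Q, axis=1))  # expect "move right" everywhere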
[Figure slides: Types of Learning (cont'd).]
Daily Quiz
1. Which of the following methods do we use to find the best-fit line for data in linear regression?
A) Least square error  B) Maximum likelihood  C) Logarithmic loss  D) Both A and B
2. Which of the following is true about residuals?
A) Lower is better  B) Higher is better  C) A or B, depending on the situation  D) None of these
Weekly Assignment
1. What is supervised learning in ML?
2. What is unsupervised learning in machine learning?
3. What is the difference between supervised and unsupervised learning?

Recap
• Learning is the process of converting experience into expertise or knowledge.
• Supervised
• Unsupervised
• Semi-supervised
WHAT IS REGRESSION?
• Regression is a supervised learning technique that helps in finding the correlation between variables and enables us to predict a continuous output variable from one or more predictor variables.
• Regression analysis is a statistical method for modelling the relationship between a dependent (target) variable and one or more independent (predictor) variables.
• More specifically, regression analysis helps us understand how the value of the dependent variable changes with one independent variable while the other independent variables are held fixed.
• It predicts continuous/real values such as temperature, age, salary, price, etc.

REGRESSION (CONT'D)
• In regression, we fit a line or curve that best matches the given data points; using this fit, the machine learning model can make predictions about the data.
• In simple words, regression finds a line or curve on the target-predictor graph that passes as close as possible to the data points, so that the vertical distance between the data points and the regression line is minimized.
Examples:
• Predicting rain using temperature and other factors
• Determining market trends
• Predicting road accidents due to rash driving
TERMINOLOGIES
• Dependent variable: the main factor that we want to predict or understand in regression analysis; also called the target variable.
• Independent variable: a factor that affects the dependent variable or is used to predict its value; also called a predictor.
• Outliers: an outlier is an observation with a very low or very high value in comparison to the other observed values. An outlier may distort the result, so it should be handled carefully.
• Multicollinearity: when the independent variables are highly correlated with each other, the condition is called multicollinearity. It should not be present in the dataset, because it creates problems when ranking the most influential variables.
• Underfitting and overfitting: if our algorithm works well on the training dataset but not on the test dataset, the problem is called overfitting; if it does not perform well even on the training dataset, the problem is called underfitting.
TYPES OF REGRESSION
[Figure slide: taxonomy of the regression techniques discussed below.]
1. LINEAR REGRESSION
• Linear regression is a statistical regression method used for predictive analysis.
• It is one of the simplest algorithms; it works on regression problems and shows the relationship between continuous variables.
• Linear regression shows the linear relationship between the independent variable (x-axis) and the dependent variable (y-axis), hence the name.
• If there is only one input variable (x), it is called simple linear regression; if there is more than one input variable, it is called multiple linear regression.
• The relationship between the variables can be illustrated by predicting the salary of an employee on the basis of years of experience.
The mathematical equation for linear regression is:
Y = mX + c
where Y is the dependent (target) variable, X is the independent (predictor) variable, m is the slope, and c is the intercept. A small fitting sketch follows.
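As an illustration, Y = mX + c can be fitted by least squares in a few lines; the salary-vs-experience numbers below are made up for the sketch.

import numpy as np

experience = np.array([1, 2, 3, 4, 5, 6], dtype=float)    # X, years of experience
salary = np.array([30, 35, 42, 48, 52, 61], dtype=float)  # Y, salary in thousands

m, c = np.polyfit(experience, salary, deg=1)  # least-squares slope and intercept
print(f"salary = {m:.2f} * experience + {c:.2f}")
print("prediction for 7 years:", m * 7 + c)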
2. MULTIPLE REGRESSION
• Multiple regression explains the relationship between multiple independent (predictor) variables and one dependent (criterion) variable.
• The dependent variable is modelled as a function of several independent variables with corresponding coefficients, along with a constant term.
• Multiple regression requires two or more predictor variables, which is why it is called multiple regression.
• The multiple regression equation takes the form y = b1x1 + b2x2 + ... + bnxn + c, where the bi (i = 1, 2, ..., n) are the regression coefficients, each representing the amount by which the criterion variable changes when the corresponding predictor variable changes. A short fitting sketch follows.
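A small sketch of fitting the multiple-regression coefficients with ordinary least squares; the two-predictor dataset is invented for illustration (it satisfies y = 2*x1 + 3*x2 exactly).

import numpy as np

X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])  # x1, x2
y = np.array([8.0, 7.0, 18.0, 17.0, 25.0])

A = np.hstack([X, np.ones((len(X), 1))])      # append a column of 1s for the constant term c
coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares solution
b1, b2, c = coef
print(f"y = {b1:.2f}*x1 + {b2:.2f}*x2 + {c:.2f}")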
3. POLYNOMIAL REGRESSION
• Polynomial regression is a type of regression that models a non-linear dataset using a linear model.
• It is similar to multiple linear regression, but it fits a non-linear curve between the values of x and the corresponding conditional values of y.
• Suppose a dataset consists of data points arranged in a non-linear fashion; linear regression will not fit such data points well. To cover them, we need polynomial regression.
• In polynomial regression, the original features are transformed into polynomial features of a given degree and then modelled using a linear model, which means the data points are fitted with a polynomial curve.
• The polynomial regression equation is derived from the linear regression equation: Y = b0 + b1x is transformed into Y = b0 + b1x + b2x^2 + b3x^3 + ... + bnx^n, where Y is the predicted/target output, b0, b1, ..., bn are the regression coefficients, and x is the independent/input variable. A short sketch follows.
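A minimal sketch of this transform-then-fit idea using scikit-learn; the quadratic data are synthetic.

import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

x = np.linspace(-3, 3, 30).reshape(-1, 1)
y = 0.5 * x.ravel() ** 2 + x.ravel() + 2           # quadratic ground truth

X_poly = PolynomialFeatures(degree=2).fit_transform(x)  # columns [1, x, x^2]
model = LinearRegression().fit(X_poly, y)               # a *linear* model on polynomial features
print("coefficients:", model.coef_, "intercept:", model.intercept_)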
4. SUPPORT VECTOR REGRESSION
• Support Vector Machine (SVM) is a supervised learning algorithm that can be used for regression as well as classification problems.
• Support Vector Regression (SVR) is the regression variant; it works for continuous variables.
• Kernel: a function used to map lower-dimensional data into a higher-dimensional space.
• Hyperplane: in SVM classification, the separating line between two classes; in SVR, the line that helps predict the continuous variable and covers most of the data points.
• Boundary lines: the two lines on either side of the hyperplane that create a margin for the data points.
• Support vectors: the data points nearest to the hyperplane and the opposite class.
• In SVR, we always try to determine a hyperplane with a maximum margin, so that the maximum number of data points is covered by that margin. The main goal of SVR is to include as many data points as possible between the boundary lines, with the hyperplane (best-fit line) covering the maximum number of data points. A short sketch follows.
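A minimal SVR sketch on synthetic noisy data; the kernel, C, and epsilon values are illustrative choices, not prescribed by the slides (epsilon sets the tube within which errors are tolerated).

import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X = np.sort(rng.uniform(0, 5, 60)).reshape(-1, 1)
y = np.sin(X).ravel() + 0.1 * rng.normal(size=60)   # noisy sine curve

svr = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(X, y)
print("support vectors used:", len(svr.support_))
print("prediction at x = 2.5:", svr.predict([[2.5]])[0])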
[Figure slides: introduction to decision trees.]
DECISION TREE
• A decision tree is a supervised learning technique that can be used for both classification and regression problems, but it is mostly preferred for solving classification problems.
• It is a tree-structured classifier, in which internal nodes represent the features of a dataset, branches represent the decision rules, and each leaf node represents an outcome.
• A decision tree has two kinds of nodes: decision nodes and leaf nodes. Decision nodes are used to make decisions and have multiple branches, whereas leaf nodes are the outputs of those decisions and contain no further branches.
• The decisions or tests are performed on the basis of the features of the given dataset.
• It is a graphical representation for obtaining all possible solutions to a problem/decision under the given conditions.
• It is called a decision tree because, like a tree, it starts at a root node and expands along further branches, constructing a tree-like structure.
• To build a tree, we can use the CART algorithm, which stands for Classification And Regression Tree.
• A decision tree simply asks a question and, based on the answer (yes/no), splits further into subtrees.
• Note: a decision tree can handle categorical data (yes/no) as well as numeric data.
ID3 ALGORITHM
• Iterative Dichotomiser 3, commonly known as ID3, was invented by Ross Quinlan.
• It is a classification algorithm that follows a greedy approach, building a decision tree by selecting at each step the attribute that yields maximum information gain (IG), i.e., minimum entropy (H).
• A decision tree is most effective when the problem has the following characteristics:
1) Instances can be described by attribute-value pairs.
2) The target function is discrete-valued.
• Entropy is a measurement of homogeneity: it tells us how impure/non-homogeneous an arbitrary dataset is.
• Given a collection of examples S containing positive and negative examples of some target concept, the entropy of S relative to this boolean classification is:
H(S) = - p(positive) * log2(p(positive)) - p(negative) * log2(p(negative))
where p(positive) and p(negative) are the proportions of positive and negative examples in S.
• When we use a node in a decision tree to partition the training instances into smaller subsets, the entropy changes. Information gain is a measure of this change in entropy:
Information Gain = entropy(parent) - [weighted average entropy(children)]
A small code sketch of both formulas follows.
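Both formulas can be written directly in code. The sketch below applies them to the class counts of the Play-Tennis example used in the following slides (9 "yes", 5 "no", split by Outlook); the helper names are our own.

import numpy as np

def entropy(labels):
    # H(S) = -sum(p_i * log2(p_i)) over the class proportions in S
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_gain(parent, children):
    # IG = entropy(parent) - weighted average entropy of the children
    n = len(parent)
    return entropy(parent) - sum(len(c) / n * entropy(c) for c in children)

parent = ["yes"] * 9 + ["no"] * 5            # Play-Tennis class counts
sunny = ["yes"] * 2 + ["no"] * 3             # subsets after splitting on Outlook
overcast = ["yes"] * 4
rain = ["yes"] * 3 + ["no"] * 2
print(round(entropy(parent), 3))                                     # 0.94
print(round(information_gain(parent, [sunny, overcast, rain]), 3))   # 0.247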
[Figure slides: worked illustrations of entropy and information gain (Entropy = ?).]
Example: Decision Tree for Play Tennis
[Figure slide: the 14-row Play-Tennis training table.]
Here, the dataset has binary classes (yes and no), where 9 out of 14 examples are "yes" and 5 out of 14 are "no". The complete entropy of the dataset is:
H(S) = - p(yes) * log2(p(yes)) - p(no) * log2(p(no))
     = - (9/14) * log2(9/14) - (5/14) * log2(5/14)
     = - (-0.41) - (-0.53)
     = 0.94
First attribute: Outlook (categorical values: sunny, overcast, rain)
H(Outlook=sunny) = -(2/5)*log2(2/5) - (3/5)*log2(3/5) = 0.971
H(Outlook=rain) = -(3/5)*log2(3/5) - (2/5)*log2(2/5) = 0.971
H(Outlook=overcast) = -(4/4)*log2(4/4) - 0 = 0
Average entropy information for Outlook:
I(Outlook) = p(sunny)*H(Outlook=sunny) + p(rain)*H(Outlook=rain) + p(overcast)*H(Outlook=overcast)
           = (5/14)*0.971 + (5/14)*0.971 + (4/14)*0 = 0.693
Information gain = H(S) - I(Outlook) = 0.94 - 0.693 = 0.247

Second attribute: Temperature (categorical values: hot, mild, cool)
H(Temperature=hot) = -(2/4)*log2(2/4) - (2/4)*log2(2/4) = 1
H(Temperature=cool) = -(3/4)*log2(3/4) - (1/4)*log2(1/4) = 0.811
H(Temperature=mild) = -(4/6)*log2(4/6) - (2/6)*log2(2/6) = 0.9179
Average entropy information for Temperature:
I(Temperature) = (4/14)*1 + (6/14)*0.9179 + (4/14)*0.811 = 0.9108
Information gain = H(S) - I(Temperature) = 0.94 - 0.9108 = 0.0292

Third attribute: Humidity (categorical values: high, normal)
H(Humidity=high) = -(3/7)*log2(3/7) - (4/7)*log2(4/7) = 0.983
H(Humidity=normal) = -(6/7)*log2(6/7) - (1/7)*log2(1/7) = 0.591
Average entropy information for Humidity:
I(Humidity) = (7/14)*0.983 + (7/14)*0.591 = 0.787
Information gain = H(S) - I(Humidity) = 0.94 - 0.787 = 0.153

Fourth attribute: Wind (categorical values: weak, strong)
H(Wind=weak) = -(6/8)*log2(6/8) - (2/8)*log2(2/8) = 0.811
H(Wind=strong) = -(3/6)*log2(3/6) - (3/6)*log2(3/6) = 1
Average entropy information for Wind:
I(Wind) = (8/14)*0.811 + (6/14)*1 = 0.892
Information gain = H(S) - I(Wind) = 0.94 - 0.892 = 0.048
Here, the attribute with maximum information gain is Outlook, so it becomes the root of the decision tree.
When Outlook = overcast, the subset is of a pure class (all "yes"). Now we repeat the same procedure for the rows with Outlook = sunny, and then for the rows with Outlook = rain.

Finding the best attribute for splitting the data with Outlook = sunny (dataset rows = [1, 2, 8, 9, 11]):
Complete entropy of the Sunny subset:
H(Sunny) = - p(yes) * log2(p(yes)) - p(no) * log2(p(no))
         = - (2/5) * log2(2/5) - (3/5) * log2(3/5) = 0.971
First attribute: Temperature (categorical values: hot, mild, cool)
H(Sunny, Temperature=hot) = -0 - (2/2)*log2(2/2) = 0
H(Sunny, Temperature=cool) = -(1)*log2(1) - 0 = 0
H(Sunny, Temperature=mild) = -(1/2)*log2(1/2) - (1/2)*log2(1/2) = 1
I(Sunny, Temperature) = (2/5)*0 + (1/5)*0 + (2/5)*1 = 0.4
Information gain = H(Sunny) - I(Sunny, Temperature) = 0.971 - 0.4 = 0.571

Second attribute: Humidity (categorical values: high, normal)
H(Sunny, Humidity=high) = -0 - (3/3)*log2(3/3) = 0
H(Sunny, Humidity=normal) = -(2/2)*log2(2/2) - 0 = 0
I(Sunny, Humidity) = (3/5)*0 + (2/5)*0 = 0
Information gain = H(Sunny) - I(Sunny, Humidity) = 0.971 - 0 = 0.971

Third attribute: Wind (categorical values: weak, strong)
H(Sunny, Wind=weak) = -(1/3)*log2(1/3) - (2/3)*log2(2/3) = 0.918
H(Sunny, Wind=strong) = -(1/2)*log2(1/2) - (1/2)*log2(1/2) = 1
I(Sunny, Wind) = (3/5)*0.918 + (2/5)*1 = 0.9508
Information gain = H(Sunny) - I(Sunny, Wind) = 0.971 - 0.9508 = 0.0202
Here, the attribute with maximum information gain is Humidity, so the Sunny branch splits on Humidity.
When Outlook = sunny and Humidity = high, the subset is a pure class of category "no"; when Outlook = sunny and Humidity = normal, it is again a pure class, of category "yes". Therefore no further calculation is needed on this branch.

Now, finding the best attribute for splitting the data with Outlook = rain (dataset rows = [4, 5, 6, 10, 14]):
Complete entropy of the Rain subset:
H(Rain) = - p(yes) * log2(p(yes)) - p(no) * log2(p(no))
        = - (3/5) * log2(3/5) - (2/5) * log2(2/5) = 0.971
First attribute: Temperature (categorical values: mild, cool)
H(Rain, Temperature=cool) = -(1/2)*log2(1/2) - (1/2)*log2(1/2) = 1
H(Rain, Temperature=mild) = -(2/3)*log2(2/3) - (1/3)*log2(1/3) = 0.918
I(Rain, Temperature) = (2/5)*1 + (3/5)*0.918 = 0.9508
Information gain = H(Rain) - I(Rain, Temperature) = 0.971 - 0.9508 = 0.0202

Second attribute: Wind (categorical values: weak, strong)
H(Rain, Wind=weak) = -(3/3)*log2(3/3) - 0 = 0
H(Rain, Wind=strong) = -0 - (2/2)*log2(2/2) = 0
I(Rain, Wind) = (3/5)*0 + (2/5)*0 = 0
Information gain = H(Rain) - I(Rain, Wind) = 0.971 - 0 = 0.971

Here, the attribute with maximum information gain is Wind, so the Rain branch splits on Wind, and the tree is complete: Outlook at the root; the Sunny branch splits on Humidity (high: no, normal: yes); Overcast is a pure "yes" leaf; the Rain branch splits on Wind (weak: yes, strong: no). A cross-check with library code follows.
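As a cross-check, the same table can be fed to scikit-learn. The 14 rows below are the standard Play-Tennis data (Mitchell), consistent with the counts used above; note that scikit-learn builds binary splits on one-hot-encoded columns, so the printed tree has a different shape from the multiway ID3 tree, although it selects the same informative attributes.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

data = pd.DataFrame({
    "Outlook": ["sunny", "sunny", "overcast", "rain", "rain", "rain", "overcast",
                "sunny", "sunny", "rain", "sunny", "overcast", "overcast", "rain"],
    "Temperature": ["hot", "hot", "hot", "mild", "cool", "cool", "cool",
                    "mild", "cool", "mild", "mild", "mild", "hot", "mild"],
    "Humidity": ["high", "high", "high", "high", "normal", "normal", "normal",
                 "high", "normal", "normal", "normal", "high", "normal", "high"],
    "Wind": ["weak", "strong", "weak", "weak", "weak", "strong", "strong",
             "weak", "weak", "weak", "strong", "strong", "weak", "strong"],
    "Play": ["no", "no", "yes", "yes", "yes", "no", "yes",
             "no", "yes", "yes", "yes", "yes", "yes", "no"],
})
X = pd.get_dummies(data.drop(columns="Play"))   # one-hot encode the categorical attributes
clf = DecisionTreeClassifier(criterion="entropy").fit(X, data["Play"])
print(export_text(clf, feature_names=list(X.columns)))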
EXAMPLE 2: PRACTICE
For the medical diagnosis data below, create a decision tree.

Sore Throat | Fever | Swollen Glands | Congestion | Headache | Diagnosis
YES | YES | YES | YES | YES | STREP THROAT
NO  | NO  | NO  | YES | YES | ALLERGY
YES | YES | NO  | YES | NO  | COLD
YES | NO  | YES | NO  | NO  | STREP THROAT
NO  | YES | NO  | YES | NO  | COLD
NO  | NO  | NO  | YES | NO  | ALLERGY
NO  | NO  | YES | NO  | NO  | STREP THROAT
YES | NO  | NO  | YES | YES | ALLERGY
NO  | YES | NO  | YES | YES | COLD
YES | NO  | NO  | YES | YES | COLD
Classification and Regression Tree (CART)
[Figure slides: CART worked example.]
INDUCTIVE BIAS IN DECISION TREES
• The inductive bias (also known as learning bias) of a learning algorithm is the set of assumptions that the learner uses to predict outputs for inputs it has not encountered.
• Approximate inductive bias of ID3: shorter trees are preferred over larger trees.
• BFS-ID3, a closer approximation to the inductive bias of ID3: shorter trees are preferred over longer trees, and trees that place high-information-gain attributes close to the root are preferred over those that do not.

DIFFERENCE BETWEEN ID3 AND CANDIDATE-ELIMINATION
• ID3 searches a complete hypothesis space incompletely; its inductive bias is solely a consequence of the ordering of hypotheses by its search strategy.
• Candidate-Elimination searches an incomplete hypothesis space completely; its inductive bias is solely a consequence of the expressive power of its hypothesis representation.

ISSUES IN DECISION TREE LEARNING
Practical issues in learning decision trees include:
• determining how deeply to grow the decision tree,
• handling continuous attributes,
• choosing an appropriate attribute-selection measure,
• handling training data with missing attribute values,
• handling attributes with differing costs, and
• improving computational efficiency.
5. RANDOM FOREST
• Random forest is a popular machine learning algorithm that belongs to the supervised learning techniques; it can be used for both classification and regression problems.
• It is based on the concept of ensemble learning, the process of combining multiple classifiers to solve a complex problem and improve the performance of the model.
• As the name suggests, a random forest is a classifier that contains a number of decision trees built on various subsets of the given dataset and aggregates their outputs to improve predictive accuracy. Instead of relying on one decision tree, the random forest takes the prediction from each tree and, based on the majority vote of the predictions, predicts the final output.
• A greater number of trees in the forest generally leads to higher accuracy and helps prevent the problem of overfitting.
[Figure slide: working of the random forest algorithm.]
• Example: suppose a dataset contains multiple fruit images and is given to a random forest classifier. The dataset is divided into subsets, one per decision tree. During the training phase, each decision tree produces a prediction result; when a new data point occurs, the random forest classifier predicts the final decision based on the majority of the results. A short sketch follows.
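A minimal sketch with scikit-learn; the synthetic dataset stands in for the fruit images, and the parameter choices are illustrative.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# An ensemble of 100 trees, each trained on a bootstrap subset of the data
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print("individual trees:", len(forest.estimators_))
print("test accuracy of the majority vote:", forest.score(X_te, y_te))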
INSTANCE-BASED LEARNING
• Machine learning systems categorized as instance-based learning are systems that learn the training examples by heart and then generalize to new instances based on some similarity measure.
• It is called instance-based because it builds its hypotheses directly from the training instances.
• It is also known as memory-based learning or lazy learning.
• The time complexity of this approach depends on the size of the training data; the worst-case time complexity is O(n), where n is the number of training instances.
• Some instance-based learning algorithms are:
1. K-Nearest Neighbor (KNN)
2. Self-Organizing Map (SOM)
3. Learning Vector Quantization (LVQ)
4. Locally Weighted Learning (LWL)
Advantages:
1. Instead of estimating the target function over the entire instance space, local approximations can be made.
2. The algorithm can adapt easily to new data collected as we go.
Disadvantages:
1. Classification costs are high.
2. A large amount of memory is required to store the data, and each query involves building a local model from scratch.
K-NEAREST NEIGHBOR (KNN) LEARNING
• K-Nearest Neighbor is one of the simplest machine learning algorithms, based on the supervised learning technique.
• The K-NN algorithm assumes similarity between the new case/data and the available cases, and puts the new case into the category most similar to the available categories.
• The K-NN algorithm stores all the available data and classifies a new data point based on similarity; when new data appears, it can easily be classified into a well-suited category.
• K-NN can be used for regression as well as classification, but it is mostly used for classification problems.
• K-NN is a non-parametric algorithm, which means it makes no assumption about the underlying data.
• It is also called a lazy learner algorithm, because it does not learn from the training set immediately; instead, it stores the dataset and performs an action on it at classification time.

KNN ALGORITHM
The working of K-NN can be explained by the following steps (a from-scratch sketch follows):
Step 1: Select the number K of neighbors.
Step 2: Calculate the Euclidean distance from the new point to the training points.
Step 3: Take the K nearest neighbors according to the calculated Euclidean distance.
Step 4: Among these K neighbors, count the number of data points in each category.
Step 5: Assign the new data point to the category for which the number of neighbors is maximum.
Step 6: The model is ready.
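A from-scratch sketch of steps 1-5 on a tiny invented two-class dataset; the function name and points are our own.

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=5):
    # Step 2: Euclidean distance from the query to every stored point
    distances = np.linalg.norm(X_train - x_query, axis=1)
    # Step 3: indices of the K nearest neighbors
    nearest = np.argsort(distances)[:k]
    # Steps 4-5: count the categories and return the majority one
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[1, 1], [1, 2], [2, 1], [6, 6], [6, 7], [7, 6]], dtype=float)
y_train = np.array(["A", "A", "A", "B", "B", "B"])
print(knn_predict(X_train, y_train, np.array([2.0, 2.0]), k=3))  # -> "A"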
WORKING OF KNN
• Suppose we have a new data point and need to put it into the required category.
• First, we choose the number of neighbors; say k = 5.
• Next, we calculate the Euclidean distance between the data points. The Euclidean distance between two points (x1, y1) and (x2, y2), as studied in geometry, is:
d = sqrt((x2 - x1)^2 + (y2 - y1)^2)
• By calculating the Euclidean distances we obtain the nearest neighbors; say three nearest neighbors fall in category A and two in category B.
• Since the 3 nearest neighbors (the majority) are from category A, the new data point must belong to category A.
PROS AND CONS OF KNN
Advantages of the KNN algorithm:
• It is simple to implement.
• It is robust to noisy training data.
• It can be more effective when the training data is large.
Disadvantages of the KNN algorithm:
• The value of K always needs to be determined, which may be complex at times.
• The computation cost is high, because the distance to every training sample must be calculated.
NUMERICAL ON KNN
Perform the KNN classification algorithm on the following dataset and predict the class for x (P1 = 3, P2 = 7), where k = 3.

P1 | P2 | Class
7  | 7  | False
7  | 4  | False
1  | 4  | True
3  | 4  | True

Euclidean distance: d = sqrt((x2 - x1)^2 + (y2 - y1)^2)
D(x, (7,7)) = sqrt((3 - 7)^2 + (7 - 7)^2) = 4    (False)
D(x, (7,4)) = sqrt((3 - 7)^2 + (7 - 4)^2) = 5    (False)
D(x, (1,4)) = sqrt((3 - 1)^2 + (7 - 4)^2) = 3.6  (True)
D(x, (3,4)) = sqrt((3 - 3)^2 + (7 - 4)^2) = 3    (True)
We need the three nearest neighbors, i.e., the three lowest distances: 3 (True), 3.6 (True), and 4 (False).
Thus, x (P1 = 3, P2 = 7) belongs to class True. The scikit-learn cross-check below agrees.
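The hand calculation can be cross-checked with scikit-learn's KNeighborsClassifier on the same four points.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X = np.array([[7, 7], [7, 4], [1, 4], [3, 4]], dtype=float)
y = np.array(["False", "False", "True", "True"])

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(knn.predict([[3.0, 7.0]]))  # ['True'], matching the hand calculation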
WHY IS KNN NON-PARAMETRIC?
Non-parametric means making no assumptions about the underlying data distribution. Non-parametric methods do not have a fixed number of parameters in the model. Similarly, in KNN the model's "parameters" actually grow with the training dataset: you can think of each training case as a parameter of the model.
PRACTICE NUMERICAL ON KNN
Suppose we have the height, weight, and T-shirt size of some customers, and we need to predict the T-shirt size of a new customer given only the height and weight information. The data are shown below.

Height (cm) | Weight (kg) | T-shirt size
158 | 58 | M
158 | 59 | M
158 | 63 | M
160 | 59 | M
160 | 60 | M
163 | 60 | M
163 | 61 | M
160 | 64 | L
163 | 64 | L
165 | 61 | L
165 | 62 | L
165 | 65 | L
168 | 62 | L
168 | 63 | L
168 | 66 | L
170 | 63 | L
170 | 64 | L
170 | 68 | L
PROBLEMS WITH LINEAR REGRESSION
• One of the problems with linear regression is that it fits a single constant line to the data once the model is created.
• Such behaviour may be acceptable when the data follow a linear pattern and do not have much noise.
• However, when the dataset is not linear, linear regression tends to underfit the training data.

LOCALLY WEIGHTED REGRESSION
• Model-based methods, such as neural networks and mixtures of Gaussians, use the data to build a parameterized model; after training, the model is used for predictions and the data are generally discarded.
• In contrast, "memory-based" methods are non-parametric approaches that explicitly retain the training data and use it each time a prediction needs to be made.
• Locally weighted regression (LWR) is a memory-based method that performs a regression around a point of interest using only training data that are "local" to that point.
• In locally weighted regression, points are weighted by proximity to the current query x using a kernel; a regression is then computed using the weighted points. A minimal sketch follows.
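A minimal sketch of LWR with a Gaussian kernel; the bandwidth tau, the helper name, and the sine-shaped data are illustrative choices.

import numpy as np

def lwr_predict(x_query, X, y, tau=0.5):
    w = np.exp(-((X - x_query) ** 2) / (2 * tau ** 2))  # proximity weights (Gaussian kernel)
    A = np.stack([np.ones_like(X), X], axis=1)          # local design matrix [1, x]
    W = np.diag(w)
    theta = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)   # weighted least-squares line
    return theta[0] + theta[1] * x_query                # evaluate the local line at the query

rng = np.random.default_rng(0)
X = np.linspace(0, 6, 80)
y = np.sin(X) + 0.1 * rng.normal(size=X.size)
print("LWR prediction at 1.5:", lwr_predict(1.5, X, y), "vs true", np.sin(1.5))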
  • 137. 1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 137 CASE BASED LEARNING • Case-Based Reasoning classifiers (CBR) use a database of problem solutions to solve new problems. It stores the tuples or cases for problem-solving as complex symbolic descriptions. HOW CBR WORKS • When a new case arrises to classify, a Case-based Reasoner(CBR) will first check if an identical training case exists. • If one is found, then the accompanying solution to that case is returned. • If no identical case is found, then the CBR will search for training cases having components that are similar to those of the new case.
• 138. 1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 138 CASE BASED LEARNING
• Conceptually, these training cases may be considered as neighbours of the new case.
• If cases are represented as graphs, this involves searching for subgraphs that are similar to subgraphs within the new case.
• The CBR tries to combine the solutions of the neighbouring training cases to propose a solution for the new case.
• If incompatibilities arise with the individual solutions, then backtracking to search for other solutions may be necessary.
• The CBR may employ background knowledge and problem-solving strategies to propose a feasible solution.
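As a toy illustration only (the help-desk cases and the component-overlap similarity below are invented for this example), the retrieve step might look like:

```python
# CBR retrieve step: exact match first, otherwise the most similar stored case.
cases = [
    ({"symptom": "no_power", "device": "router"}, "replace power adapter"),
    ({"symptom": "no_signal", "device": "router"}, "reset to factory settings"),
    ({"symptom": "no_power", "device": "modem"}, "check the fuse"),
]

def retrieve(new_case):
    for case, solution in cases:
        if case == new_case:                 # identical case: return its solution
            return solution
    # Otherwise, count shared components (a crude similarity measure)
    def similarity(case):
        return sum(case.get(k) == v for k, v in new_case.items())
    _, solution = max(cases, key=lambda cs: similarity(cs[0]))
    return solution

print(retrieve({"symptom": "no_power", "device": "printer"}))
```

A real CBR system would then adapt the retrieved solution and possibly backtrack, as described above; this sketch covers retrieval only.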
• 139. 1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 139 APPLICATIONS OF CBR
1. Problem resolution for customer service help desks, where cases describe product-related diagnostic problems.
2. It is also applied to areas such as engineering and law, where cases are either technical designs or legal rulings, respectively.
3. Medical education, where patient case histories and treatments are used to help diagnose and treat new patients.
• 140. 1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 140 PRACTICE NUMERICALS 1
Using the KNN algorithm, predict what class of fan Michelle is, given that Michelle is female and her age is 5. Assume k = 3.
NAME      AGE   GENDER   FAN
BILL      32    M        Rolling Stone
HENRY     40    M        Neither
MARY      16    F        Taylor Swift
TIFFNY    14    F        Taylor Swift
MICHAEL   55    M        Neither
CARLOS    40    M        Taylor Swift
ASHELY    20    F        Neither
ROBERT    15    M        Taylor Swift
SALLY     55    F        Rolling Stone
JOHN      15    M        Rolling Stone
• 141. 1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 141 Solution
Convert the discrete values of the Gender attribute to numeric values: let M = 0 and F = 1.
NAME      AGE   GENDER   DISTANCE   FAN
BILL      32    M=0      27.02      Rolling Stone
HENRY     40    M=0      35.01      Neither
MARY      16    F=1      11.00      Taylor Swift
TIFFNY    14    F=1      9.00       Taylor Swift
MICHAEL   55    M=0      50.01      Neither
CARLOS    40    M=0      35.01      Taylor Swift
ASHELY    20    F=1      15.00      Neither
ROBERT    15    M=0      10.05      Taylor Swift
SALLY     55    F=1      50.00      Rolling Stone
JOHN      15    M=0      10.05      Rolling Stone
ROBERT and JOHN have identical attribute values (age 15, M = 0), so both distances are √(10² + 1²) ≈ 10.05. The three nearest neighbours are TIFFNY (9.00), ROBERT (10.05) and JOHN (10.05); two of the three are Taylor Swift fans, so Michelle is predicted to be a Taylor Swift fan.
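The distance column can be reproduced with a few lines of Python (Gender encoded M = 0, F = 1 as above):

```python
# Distances from Michelle (age 5, gender F=1) to every customer, nearest first.
import math

people = [("BILL", 32, 0, "Rolling Stone"), ("HENRY", 40, 0, "Neither"),
          ("MARY", 16, 1, "Taylor Swift"), ("TIFFNY", 14, 1, "Taylor Swift"),
          ("MICHAEL", 55, 0, "Neither"), ("CARLOS", 40, 0, "Taylor Swift"),
          ("ASHELY", 20, 1, "Neither"), ("ROBERT", 15, 0, "Taylor Swift"),
          ("SALLY", 55, 1, "Rolling Stone"), ("JOHN", 15, 0, "Rolling Stone")]
michelle = (5, 1)

dists = sorted((math.dist(michelle, (age, g)), name, fan)
               for name, age, g, fan in people)
print(dists[:3])   # TIFFNY (9.00) and ROBERT/JOHN (10.05): majority Taylor Swift
```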
• 142. 1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 142 PRACTICE NUMERICALS 2
Using the KNN algorithm, predict the class of a new flower (sepal length = 5.2, sepal width = 3.1). Assume k = 5.
  • 143. 1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 143 PRACTICE NUMERICALS 2
• 144. 1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 144 PRACTICE NUMERICALS 3
Build a decision tree using the ID3 algorithm.
• 145. ID3 Algorithm will perform the following tasks recursively (a compact code sketch follows this list):
1. Create a root node for the tree.
2. If all examples are positive, return the leaf node 'positive'.
3. Else if all examples are negative, return the leaf node 'negative'.
4. Calculate the entropy of the current state, H(S).
5. For each attribute x, calculate the entropy with respect to x, denoted H(S, x).
6. Select the attribute with the maximum information gain, IG(S, x) = H(S) − H(S, x).
7. Remove the attribute that offers the highest IG from the set of attributes.
8. Repeat until we run out of attributes, or the decision tree has all leaf nodes.
1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 145
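A compact sketch of these steps in pure Python; rows are represented as dicts and 'target' names the class column (all identifiers are illustrative):

```python
# ID3: entropy, information gain, and the recursive tree construction.
import math
from collections import Counter

def entropy(rows, target):
    counts = Counter(r[target] for r in rows)
    n = len(rows)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def info_gain(rows, attr, target):
    n = len(rows)
    remainder = 0.0
    for v in {r[attr] for r in rows}:          # H(S, attr) as a weighted sum
        subset = [r for r in rows if r[attr] == v]
        remainder += len(subset) / n * entropy(subset, target)
    return entropy(rows, target) - remainder   # IG(S, attr) = H(S) - H(S, attr)

def id3(rows, attrs, target):
    labels = {r[target] for r in rows}
    if len(labels) == 1:                       # all positive or all negative: leaf
        return labels.pop()
    if not attrs:                              # no attributes left: majority class
        return Counter(r[target] for r in rows).most_common(1)[0][0]
    best = max(attrs, key=lambda a: info_gain(rows, a, target))
    return {best: {v: id3([r for r in rows if r[best] == v],
                          [a for a in attrs if a != best], target)
                   for v in {r[best] for r in rows}}}
```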
• 146. Step 1: The initial step is to calculate H(S), the entropy of the current state. In the above example there are 5 No's and 9 Yes's in total (14 examples), so
H(S) = −(9/14) log₂(9/14) − (5/14) log₂(5/14) ≈ 0.940
1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 146 SOLUTION 3
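As a quick check of this value:

```python
# H(S) for 9 Yes / 5 No out of 14 examples.
import math
p = 9 / 14
print(-p * math.log2(p) - (1 - p) * math.log2(1 - p))   # 0.940...
```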
• 147. Step 2: The next step is to calculate H(S, x), the entropy with respect to attribute x, for each attribute. In the above example, the expected information needed to classify a tuple in S if the tuples are partitioned according to age is
H(S, age) = Σᵥ (|Sᵥ| / |S|) · H(Sᵥ),
where the sum runs over the values v of the age attribute and Sᵥ is the subset of S with age = v.
1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 147 SOLUTION 3
• 148. Hence, the gain in information from such a partitioning would be
IG(S, age) = H(S) − H(S, age).
Similarly, the information gain is computed for each of the remaining attributes.
1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 148 SOLUTION 3
  • 149. Step 3: Choose attribute with the largest information gain, IG(S,x) as the decision node, divide the dataset by its branches and repeat the same process on every branch. Age has the highest information gain among the attributes, so Age is selected as the splitting attribute. 1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 149 SOLUTION 3
  • 150. Step 4a: A branch with an entropy of 0 is a leaf node. 1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 150 SOLUTION 3
  • 151. Step 4b : A branch with entropy more than 0 needs further splitting. 1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 151 SOLUTION 3
  • 152. Step 5: The ID3 algorithm is run recursively on the non-leaf branches until all data is classified. 1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 152 SOLUTION 3
• 153. 1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 153 PRACTICE NUMERICALS 4
Apply the ID3 algorithm to construct the decision tree for the given data set.
• 154. Youtube/Other Video Links:
1. Machine Learning by Prof. Balaraman Ravindran, Department of Computer Science and Engineering, IIT Madras (SWAYAM/NPTEL) https://www.youtube.com/watch?v=fC7V8QsPBec&feature=youtu.be
2. Machine Learning by Prof. Sudeshna Sarkar, Department of Computer Science and Engineering, IIT Kharagpur (NPTEL) https://www.youtube.com/watch?v=EWmCkVfPnJ8&list=PLlGkyYYWOSOsGU-XARWdIFsRAJQkyBrVj&index=2
3. Machine Learning UPGRAD course by IIIT, Bangalore https://www.upgrad.com/machine-learning-ai-pgd-iiitb/
1/22/2023 154 Faculty Video Links, Youtube & NPTEL Video Links and Online Courses Details Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5
  • 155. 1. Decision trees are an algorithm for which machine learning task? a) clustering b) dimensionality reduction c) classification d) regression 2. Which error metric is most appropriate for evaluating a {0,1} classification task? a) worst-case error b) sum of squares error c) entropy d) precision and recall 1/22/2023 155 Daily Quiz Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5
• 156. 1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 156 Daily Quiz
3. A _________ is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility.
a) Decision tree b) Graphs c) Trees d) Neural Networks
4. Which of the following are the advantage/s of Decision Trees?
a) Possible scenarios can be added
b) Use a white box model, if a given result is provided by a model
c) Worst, best and expected values can be determined for different scenarios
d) All of the mentioned
• 157. 1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 157 Daily Quiz
5. Which of the following algorithms doesn't use learning rate as one of its hyperparameters?
a) Gradient Boosting b) Extra Trees c) AdaBoost d) Random Forest
• 158. 1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 158 Weekly Assignment
NOIDA INSTITUTE OF ENGINEERING & TECHNOLOGY, GREATER NOIDA
SEMESTER (EVEN), UNIT: 2, ASSIGNMENT SHEET No. 2
Subject Name: Machine Learning
Name of Course Coordinator: Shweta Mayor
Subject Code: AMTAI0201 (M. Tech)
1. Define decision tree with example.
2. Explain types of decision trees.
3. Explain optimizing decision tree performance.
4. Explain overfitting and underfitting.
5. What is Artificial Intelligence?
• 159. 1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 159 Weekly Assignment
NOIDA INSTITUTE OF ENGINEERING & TECHNOLOGY, GREATER NOIDA
SEMESTER (EVEN), UNIT: 2, ASSIGNMENT SHEET No. 2
6. What is the difference between supervised and unsupervised machine learning?
7. List the different algorithm techniques in Machine Learning.
8. Differentiate between supervised, unsupervised, and reinforcement learning.
9. What is a perceptron in Machine Learning?
10. What is model accuracy and model performance?
• 160. 1. Which of the following is not a factor that affects the performance of a learner system?
a) Representation scheme used b) Training scenario c) Type of feedback d) Good data structures
2. What is true regarding the backpropagation rule?
a) it is also called generalized delta rule
b) error in output is propagated backwards only to determine weight updates
c) there is no feedback of signal at any stage
d) all of the mentioned
1/22/2023 160 MCQs Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5
• 161. Sub Code: MTCS031 Paper Id: 210201
M. TECH. (SEM-II) THEORY EXAMINATION 2018-19 MACHINE LEARNING
Time: 3 Hours Total Marks: 70
Note: Attempt all Sections. If any data is missing, choose suitably.
SECTION A
Attempt all questions in brief. 2 x 7 = 14
a. Define Machine Learning.
b. Explain regression model.
c. What is ANN?
d. Explain well-defined learning problems.
e. Define decision tree.
f. Explain Bayes classifier.
g. Explain Q-learning.
1/22/2023 161 Old Question Papers Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5
• 162. SECTION B
Attempt any three of the following: 7 x 3 = 21
a. Explain the role of genetic algorithm in knowledge-based techniques.
b. Differentiate between genetic algorithm and traditional algorithm with a suitable example.
c. Explain various ANN architectures in detail.
d. Describe any algorithm to implement simulated annealing.
e. Explain DBSCAN with its role in forming clusters.
SECTION C
Attempt any one part of the following: 7 x 1 = 7
(a) Explain the back propagation algorithm with a suitable example.
(b) Explain learning with any two learning techniques with its expression for weight updating.
Attempt any one part of the following: 7 x 1 = 7
(a) Write short notes on the following: (i) Sampling Theory (ii) Bayes Theorem
(b) Explain any comparing learning technique with a suitable example.
1/22/2023 162 Old Question Papers Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5
• 163. Attempt any one part of the following: 7 x 1 = 7
(a) Explain the following: (i) Generalization (ii) Multilayer Network
(b) Describe the decision tree learning algorithm with an example.
Attempt any one part of the following: 7 x 1 = 7
(a) Define the process of designing a learning system. Explain various issues in machine learning.
(b) Explain the Candidate Elimination algorithm in detail.
Attempt any one part of the following: 7 x 1 = 7
(a) Explain FOIL in detail.
(b) Explain the following: (i) Hypotheses (ii) Inductive Bias (iii) Perceptron
1/22/2023 163 Old Question Papers Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5
• 164. 1. Explain the various types of issues in machine learning.
2. Define the learning classifiers.
3. Differentiate between Bayesian learning and instance-based learning.
4. Discuss the steps in the KNN algorithm and its applications.
5. Explain the back propagation algorithm and derive expressions for the weight update relations.
6. Describe the ID3 algorithm with a proper example.
1/22/2023 164 Expected Questions for University Exam Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5
• 165. 1/22/2023 165 Summary Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5
The perceptron training rule is guaranteed to succeed if
1. the training examples are linearly separable, and
2. the learning rate η is sufficiently small.
The Adaline training rule uses gradient descent. It is
1. guaranteed to converge to the hypothesis with minimum squared error,
2. given a sufficiently small learning rate η,
• 166. 1/22/2023 166 Summary Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5
3. even when the training data contain noise,
4. even when the training data are not separable by H.
Problems: slow convergence to a local or global minimum.
• 167. 1/22/2023 167 References
Reference Books:
1. Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, An Introduction to Statistical Learning, Springer, 2013.
2. Richard Duda, Peter Hart, David Stork, Pattern Classification, 2nd Ed., John Wiley & Sons, 2001.
3. Tom M. Mitchell, Machine Learning, McGraw Hill International Edition.
4. Ethem Alpaydin, Introduction to Machine Learning, Eastern Economy Edition, Prentice Hall of India, 2005.
5. C. Bishop, Pattern Recognition and Machine Learning, Berlin: Springer-Verlag.
Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5
  • 168. 1/22/2023 Dr. Kumod Kr. Gupta Machine Learning (AEC0516) unit-5 168