IEEE Presentation.pptx

Machine Learning
Dr Debabrata Swain
Assistant Professor (Senior Grade)
CSE Department
Pandit Deendayal Energy University
Gandhinagar, Gujarat

Machine Learning
• Machine learning is a method of data analysis that automates analytical
model building. It is a branch of artificial intelligence based on the idea
that systems can learn from data, identify patterns and make decisions with
minimal human intervention.

Traditional Computing vs Machine Learning

Types of Learning
• Supervised
• Unsupervised

Classification Algorithms
• Logistic Regression
• K-NN Algorithm
• Decision Tree
• Random Forest

Logistic Regression
• It is a machine learning algorithm used for
binary as well as multiclass classification.
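
As a rough illustration of how such a classifier turns a score into a class, the sketch below uses the sigmoid function for the binary case and softmax for the multiclass case. The weights, bias and threshold are hypothetical values chosen for the example, not parameters from the slides, and softmax is only one common way to extend logistic regression to several classes.

```python
from math import exp

def sigmoid(z):
    """Map any real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + exp(-z))

def predict_binary(features, weights, bias, threshold=0.5):
    """Return class 1 if the predicted probability reaches the threshold, else 0."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if sigmoid(z) >= threshold else 0

def softmax(scores):
    """Turn one score per class into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical one-feature model: weight 1.0, bias -1.0
print(predict_binary([2.0], [1.0], -1.0))
print(softmax([1.0, 2.0, 3.0]))
```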

K-NN Algorithm
• It is a classification algorithm.
• It needs a set of examples whose correct
classes are already known, called reference
records.
• To classify an unknown point, it computes the
distance from that point to every reference
record and picks the K nearest ones.
• The majority class among those K neighbours
is assigned to the unknown record.

Name Age Gender Sport
Ajay 32 M Football
Mark 40 M Neither
Sara 16 F Cricket
Zaira 34 F Cricket
Sachin 55 M Neither
Rahul 40 M Cricket
Pooja 20 F Neither
Smith 15 M Cricket
Laxmi 55 F Football
Michael 15 M Football

• Before training the model, we have to
preprocess the data.
• The Gender column carries discrete,
non-numeric data.
• We have to convert it to numeric categorical
values: Male = 0, Female = 1.
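
A minimal sketch of this encoding step, assuming the records are held as plain tuples. The names `GENDER_CODES`, `raw_records` and `encoded` are illustrative, and only the first three rows of the table are shown.

```python
# Mapping from the slide: Male = 0, Female = 1
GENDER_CODES = {"M": 0, "F": 1}

# (name, age, gender, sport): first three rows of the table
raw_records = [
    ("Ajay", 32, "M", "Football"),
    ("Mark", 40, "M", "Neither"),
    ("Sara", 16, "F", "Cricket"),
]

# Replace the gender letter with its numeric code; other columns are unchanged
encoded = [
    (name, age, GENDER_CODES[gender], sport)
    for name, age, gender, sport in raw_records
]

for row in encoded:
    print(row)
```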

Name Age Gender Sport
Ajay 32 0 Football
Mark 40 0 Neither
Sara 16 1 Cricket
Zaira 34 1 Cricket
Sachin 55 0 Neither
Rahul 40 0 Cricket
Pooja 20 1 Neither
Smith 15 0 Cricket
Laxmi 55 1 Football
Michael 15 0 Football

• Assume K = 3 (number of neighbours).
• What will be the class of the test record?
Name: Anjelina
Age: 5
Gender: Female (1)
• The Euclidean distance formula is used to find the distance
between the test record and each reference record.

• Distance between Ajay and Anjelina:
$$\sqrt{(5 - 32)^2 + (1 - 0)^2} = \sqrt{730} \approx 27.02$$
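
The same calculation can be checked in a couple of lines. This sketch assumes each record is represented as an (age, encoded gender) pair and uses `math.dist` from the Python standard library (Python 3.8+); the variable names are illustrative.

```python
from math import dist

anjelina = (5, 1)  # (age, encoded gender) of the test record
ajay = (32, 0)     # (age, encoded gender) of the reference record

# Euclidean distance between the two points
distance = dist(anjelina, ajay)
print(round(distance, 2))  # 27.02
```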

Name Age Gender Sport Distance
Ajay 32 0 Football 27.02
Mark 40 0 Neither 35.01
Sara 16 1 Cricket 11
Zaira 34 1 Cricket 29
Sachin 55 0 Neither 50.01
Rahul 40 0 Cricket 35.01
Pooja 20 1 Neither 15
Smith 15 0 Cricket 10.05
Laxmi 55 1 Football 50
Michael 15 0 Football 10.05

The 3 nearest records are:
• Smith (10.05): Cricket
• Michael (10.05): Football
• Sara (11): Cricket
Anjelina is assigned the class Cricket (majority).
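
The whole worked example can be sketched in a few lines of Python. The code below takes the ages and encoded genders from the table, computes the Euclidean distance to the test record, and takes a majority vote among the K = 3 nearest records. The function and variable names are illustrative and not part of the original slides.

```python
from collections import Counter
from math import dist

# (name, age, gender, sport) with gender encoded as Male = 0, Female = 1
records = [
    ("Ajay", 32, 0, "Football"),
    ("Mark", 40, 0, "Neither"),
    ("Sara", 16, 1, "Cricket"),
    ("Zaira", 34, 1, "Cricket"),
    ("Sachin", 55, 0, "Neither"),
    ("Rahul", 40, 0, "Cricket"),
    ("Pooja", 20, 1, "Neither"),
    ("Smith", 15, 0, "Cricket"),
    ("Laxmi", 55, 1, "Football"),
    ("Michael", 15, 0, "Football"),
]

def knn_predict(records, query, k=3):
    """Return the majority class among the k records closest to query."""
    # Sort all reference records by distance to the query point
    ranked = sorted(records, key=lambda r: dist((r[1], r[2]), query))
    neighbours = ranked[:k]
    # Count the class labels of the k nearest records
    votes = Counter(r[3] for r in neighbours)
    return votes.most_common(1)[0][0], neighbours

anjelina = (5, 1)  # age 5, Female
label, neighbours = knn_predict(records, anjelina, k=3)

for name, age, gender, sport in neighbours:
    print(name, round(dist((age, gender), anjelina), 2), sport)
print("Predicted class:", label)  # Cricket
```

Because age spans a much wider range than the 0/1 gender code, the distance here is driven almost entirely by age.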

Decision Tree
• When the dataset is linearly separable, we can
use logistic regression to draw the decision
boundary for classification.

• But when the data points are complex or not
linearly separable, logistic regression cannot
be used for classification.

• In such problems the decision tree algorithm
is used to draw several decision boundaries.
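
To make the contrast concrete, the sketch below uses a hypothetical XOR-style dataset of four points, which no single straight line can separate. It searches a coarse grid of linear boundaries and then applies a small hand-built decision tree. The tree is written by hand as nested threshold tests rather than learned from data, so it only illustrates how multiple boundaries solve the problem.

```python
from itertools import product

# Hypothetical XOR-style data: the class is 1 when exactly one feature is 1
points = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def linear_accuracy(w1, w2, b):
    """Fraction of points classified correctly by one straight-line boundary."""
    correct = sum(
        (1 if w1 * x + w2 * y + b > 0 else 0) == label
        for (x, y), label in points
    )
    return correct / len(points)

# Try many straight lines: weights and bias from -3.0 to 3.0 in steps of 0.5
grid = [i / 2 for i in range(-6, 7)]
best_linear = max(
    linear_accuracy(w1, w2, b) for w1, w2, b in product(grid, repeat=3)
)

def tree_predict(x, y):
    """Hand-built depth-2 decision tree: split on x first, then on y."""
    if x < 0.5:
        return 1 if y >= 0.5 else 0
    return 0 if y >= 0.5 else 1

tree_accuracy = sum(
    tree_predict(x, y) == label for (x, y), label in points
) / len(points)

print("Best single linear boundary:", best_linear)  # 0.75
print("Decision tree:", tree_accuracy)              # 1.0
```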

Random Forest
• It is a collection of different decision trees.
• Each decision tree is trained on a random
batch of records drawn from the total records.
• This gives us a number of classifiers. To find
the class of an unknown record, every
individual classifier gives its decision, and
a voting algorithm then selects the class with
the majority of votes.
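
A minimal sketch of this idea, under several simplifying assumptions: the data is a hypothetical one-feature toy set, each "tree" is a one-level decision stump, and each random batch is drawn with replacement (bootstrap sampling), which is a common choice but is not specified in the slides. All names are illustrative.

```python
import random
from collections import Counter

def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def train_stump(sample):
    """Fit a one-level tree on (x, label) pairs: one threshold, two leaf labels."""
    values = sorted({x for x, _ in sample})
    fallback = majority([lbl for _, lbl in sample])
    best = (None, fallback, fallback)  # used when the sample has one distinct x
    best_errors = len(sample) + 1
    # Candidate thresholds are midpoints between consecutive distinct values
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = majority([lbl for x, lbl in sample if x < threshold])
        right = majority([lbl for x, lbl in sample if x >= threshold])
        errors = sum((left if x < threshold else right) != lbl for x, lbl in sample)
        if errors < best_errors:
            best, best_errors = (threshold, left, right), errors
    return best

def stump_predict(stump, x):
    threshold, left, right = stump
    if threshold is None:
        return left
    return left if x < threshold else right

def train_forest(records, n_trees=25, seed=0):
    """Train each stump on its own random batch of the records."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        batch = [rng.choice(records) for _ in records]  # sampled with replacement
        forest.append(train_stump(batch))
    return forest

def forest_predict(forest, x):
    """Every tree votes; the class with the most votes wins."""
    return majority([stump_predict(stump, x) for stump in forest])

# Hypothetical toy data: small values belong to class "A", large ones to "B"
records = [(1, "A"), (2, "A"), (3, "A"), (4, "A"),
           (6, "B"), (7, "B"), (8, "B"), (9, "B")]
forest = train_forest(records)
print(forest_predict(forest, 2), forest_predict(forest, 8))
```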
