Overview of Machine Learning
Part 02
DT, RF, KNN, Clustering
Evaluation of models
How do you measure whether your model is performing well?
Classification: Accuracy, Precision, Recall, F1-score, Confusion Matrix
Regression: MAE, MSE, RMSE, R-squared
Mobile Price Range Prediction Dataset
Accuracy
accuracy = correct predictions / total predictions = 6/9 = 0.667, or 66.7% accuracy
Confusion Matrix

Actual     Predicted
Positive   Positive  = True Positive (TP)
Positive   Negative  = False Negative (FN)
Negative   Positive  = False Positive (FP)
Negative   Negative  = True Negative (TN)
Confusion Matrix

Actual    Predicted
Cat       Cat
Cat       Not Cat
Cat       Cat
Not Cat   Not Cat
Not Cat   Cat
Not Cat   Not Cat
Not Cat   Not Cat
Cat       Cat
Not Cat   Cat
Not Cat   Not Cat
Cat       Not Cat
Cat       Cat
Not Cat   Not Cat

                 Predicted Cat   Predicted Not Cat
Actual Cat            4                 2
Actual Not Cat        2                 5
Performance Quiz
Can you tell the accuracy from the
confusion matrix?
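As a check, here is a minimal sketch of the Cat / Not Cat example above, assuming scikit-learn is available; it reproduces the matrix and computes the accuracy asked for in the quiz.

```python
# Minimal sketch, assuming scikit-learn; labels are plain strings.
from sklearn.metrics import confusion_matrix, accuracy_score

actual    = ["Cat", "Cat", "Cat", "Not Cat", "Not Cat", "Not Cat", "Not Cat",
             "Cat", "Not Cat", "Not Cat", "Cat", "Cat", "Not Cat"]
predicted = ["Cat", "Not Cat", "Cat", "Not Cat", "Cat", "Not Cat", "Not Cat",
             "Cat", "Cat", "Not Cat", "Not Cat", "Cat", "Not Cat"]

# Rows = actual, columns = predicted; label order fixed to [Cat, Not Cat]
print(confusion_matrix(actual, predicted, labels=["Cat", "Not Cat"]))
# [[4 2]
#  [2 5]]

# Accuracy = (TP + TN) / total = (4 + 5) / 13
print(accuracy_score(actual, predicted))  # 0.6923...
```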
Precision
Of all the samples the model predicts as positive, how many are actually positive?
Precision = TP / (TP + FP)

                  Predicted
                  Positive   Negative
Actual Positive   TP         FN
Actual Negative   FP         TN
Recall
Of all the samples that were actually positive, how many did the model catch as positive?
Recall = TP / (TP + FN)

                  Predicted
                  Positive   Negative
Actual Positive   TP         FN
Actual Negative   FP         TN
F1 Score
● Why is the F1 score better than accuracy, precision, or recall alone?
Terrorist Detection Model —> Accuracy? (with so few real positives, a model that predicts "not terrorist" for everyone still scores near-perfect accuracy)
In a model with TP=40, FP=1, FN=20, TN=39 —> Precision? (40/41 ≈ 0.98, even though 20 positives were missed)
In a model with TP=40, FP=20, FN=1, TN=39 —> Recall? (40/41 ≈ 0.98, even though there were 20 false alarms)
● Why the harmonic average instead of the normal average?
Let P = 99 and R = 20:
(P + R)/2 = 59.5, but F1 = 2PR/(P + R) = 33.277
The harmonic average stays low unless both precision and recall are high.
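The arithmetic above in a short plain-Python sketch, so the quiz values can be checked directly; P and R are kept in percent as on the slide.

```python
# Precision, recall, and F1 from raw counts, plus the harmonic-mean effect.
def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f1(p, r):
    return 2 * p * r / (p + r)   # harmonic mean of precision and recall

print(precision(40, 1))   # TP=40, FP=1  -> 40/41 = 0.9756...
print(recall(40, 1))      # TP=40, FN=1  -> 40/41 = 0.9756...

p, r = 99, 20
print((p + r) / 2)        # 59.5   (arithmetic mean looks fine)
print(f1(p, r))           # 33.277 (harmonic mean exposes the low recall)
```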
Decision Tree
● Programmatically → it is a giant structure of nested if-else conditions (see the sketch below)
● Mathematically → it uses hyperplanes to cut the coordinate system into regions
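For instance, a tree trained on the Gender/Occupation table below could compile down to this hypothetical nested if-else (a sketch, not any library's actual output):

```python
# Hypothetical illustration: the learned tree is just nested if-else
# conditions over the features.
def suggest(gender, occupation):
    if occupation == "Student":
        return "PUBG"
    else:                          # Programmer
        if gender == "M":
            return "Whatsapp"
        else:
            return "Github"

print(suggest("F", "Programmer"))  # Github
```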
Based on Gender (female / male branches):

Gender   Occup.       Sugges.
F        Student      PUBG
F        Programmer   Github
F        Programmer   Github

Gender   Occup.       Sugges.
M        Programmer   Whatsapp
M        Student      PUBG
M        Student      PUBG

Based on Occupation (student / programmer branches):

Gender   Occup.       Sugges.
F        Student      PUBG
M        Student      PUBG
M        Student      PUBG

Gender   Occup.       Sugges.
F        Programmer   Github
M        Programmer   Whatsapp
F        Programmer   Github
Entropy
Measure of purity/impurity of a node:
E = -Σ pᵢ log₂ pᵢ  (summed over the classes i in the node)
Entropy of each child node (log base 2):

Gender = F: {PUBG, Github, Github}
E = -⅓ log₂ ⅓ - ⅔ log₂ ⅔ = 0.918

Gender = M: {Whatsapp, PUBG, PUBG}
E = -⅓ log₂ ⅓ - ⅔ log₂ ⅔ = 0.918

Occupation = Student: {PUBG, PUBG, PUBG}
E = -3/3 log₂ 3/3 = 0

Occupation = Programmer: {Github, Whatsapp, Github}
E = -⅓ log₂ ⅓ - ⅔ log₂ ⅔ = 0.918
Calculating using Information Gain
Information Gain measures the quality of a split.
● Step-1: Calculate the entropy of the parent
E(Parent) = -1/6 log₂ 1/6 - 2/6 log₂ 2/6 - 3/6 log₂ 3/6 = 1.459
● Step-2: Calculate the entropy of the children
[done above]
● Step-3: Calculate the weighted information I of the children
I(Gender) = (3/6 × 0.918) + (3/6 × 0.918) = 0.918
I(Occupation) = (3/6 × 0) + (3/6 × 0.918) = 0.459
● Step-4: Calculate the gain for each split
gain(Gender) = E(Parent) - I(Gender) = 1.459 - 0.918 = 0.541
gain(Occupation) = E(Parent) - I(Occupation) = 1.459 - 0.459 = 1.0
Occupation gives the higher gain, so it is the better attribute to split on first.
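A minimal plain-Python sketch of the whole calculation, to verify the numbers above (log base 2, standard library only):

```python
# Entropy and information gain for the Gender vs Occupation split.
from math import log2
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gain(parent, children):
    n = len(parent)
    info = sum(len(c) / n * entropy(c) for c in children)  # weighted info I
    return entropy(parent) - info

parent = ["PUBG", "PUBG", "PUBG", "Github", "Github", "Whatsapp"]
print(entropy(parent))                       # 1.459

female = ["PUBG", "Github", "Github"]        # children of the Gender split
male   = ["Whatsapp", "PUBG", "PUBG"]
student    = ["PUBG", "PUBG", "PUBG"]        # children of the Occupation split
programmer = ["Github", "Whatsapp", "Github"]

print(gain(parent, [female, male]))          # 0.541
print(gain(parent, [student, programmer]))   # 1.0
```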
Overfitting & Underfitting
Random Forest
● Wisdom of the Crowd
the collective opinion of a diverse, independent group of individuals
Examples: IMDb ratings, democracy
● Ensemble Learning
a collection of multiple machine learning models.
An ensemble method requires variation. Ways to bring variation:
1) Using different models
2) Using the same model but different datasets
3) Mixing both of the above
Types of Ensemble Learning
Voting: the same dataset (Dataset1) is fed to several different models (DT, LgR, SVM), and their predictions are combined by vote.
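A minimal sketch of hard voting, assuming scikit-learn is available and using the built-in iris data as a stand-in for Dataset1:

```python
# Hard voting: DT, LgR and SVM all see the same data; majority vote wins.
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = VotingClassifier(
    estimators=[("dt", DecisionTreeClassifier()),
                ("lgr", LogisticRegression(max_iter=1000)),
                ("svm", SVC())],
    voting="hard",                 # majority vote over predicted class labels
)
clf.fit(X, y)
print(clf.predict(X[:3]))
```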
Types of Ensemble Learning
Stacking: gives priority to the models that are more accurate. The base models (DT, LgR, SVM) are trained on Dataset1, and a meta-model assigns each of them a weight (w0, w1, w2).
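A minimal stacking sketch under the same assumptions (scikit-learn, iris as stand-in data); the meta-model here is a logistic regression whose coefficients play the role of the weights:

```python
# Stacking: base models feed their predictions to a meta-model.
from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
stack = StackingClassifier(
    estimators=[("dt", DecisionTreeClassifier()), ("svm", SVC())],
    final_estimator=LogisticRegression(max_iter=1000),  # learns the weights
)
stack.fit(X, y)
print(stack.predict(X[:3]))
```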
Types of Ensemble Learning
Bagging (Bootstrap Aggregation): the dataset is resampled into Dataset1, Dataset2, and Dataset3, and the same model type (e.g. DT) is trained on each resample.
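A minimal bagging sketch under the same assumptions; note the `estimator` parameter name assumes scikit-learn 1.2+:

```python
# Bagging: the same model type on bootstrap resamples, aggregated by vote.
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
bag = BaggingClassifier(
    estimator=DecisionTreeClassifier(),
    n_estimators=3,       # three resampled datasets, three trees
    bootstrap=True,       # sample with replacement (bootstrap)
)
bag.fit(X, y)
print(bag.score(X, y))
```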
Types of Ensemble Learning
Boosting: models (e.g. DTs) are trained one after another on the dataset, each focusing on the mistakes of the one before it.
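A minimal boosting sketch under the same assumptions (scikit-learn 1.2+, iris as stand-in data), using AdaBoost as one concrete boosting algorithm:

```python
# Boosting: sequential weak learners, each reweighting the previous errors.
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
boost = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # weak learner (a stump)
    n_estimators=50,
)
boost.fit(X, y)
print(boost.score(X, y))  # training accuracy of the boosted ensemble
```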
Why Do Ensemble Techniques Work?
Random Forest
If all the models in bagging are decision trees, then it's a random forest (which typically also considers a random subset of features at each split).
Out of Bag (OOB) Evaluation
Out-of-bag samples: samples that are never picked in any bootstrap sample.
Dataset = {1,2,3,4,5,6,7,8,9}
● DT1 = {1,3,2,5,6}
● DT2 = {2,9,6,5,2}
● DT3 = {4,1,2,9,4}
Samples 7 and 8 are never used; they are out-of-bag samples. Mathematically, about 37% of samples end up OOB: the chance a sample is never drawn in n draws with replacement is (1 - 1/n)^n, which tends to 1/e ≈ 0.37.
They are used as a validation set, because they were never seen by the model.
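A minimal sketch tying bagging, random forest, and OOB evaluation together, assuming scikit-learn; `oob_score=True` scores each sample only with the trees that never saw it in their bootstrap sample:

```python
# Random forest with out-of-bag scoring: a free validation estimate.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
rf = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=42)
rf.fit(X, y)
print(rf.oob_score_)   # accuracy measured on the out-of-bag samples
```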
K-Nearest Neighbour
Eager learning: a model that learns from the training data and makes predictions using the knowledge gained during training.
Examples: Linear Regression, Logistic Regression, Decision Tree, Random Forest

Lazy learning: a model that doesn't learn from the training data up front and uses the training data only while making predictions.
Examples: Naive Bayes, K-Nearest Neighbor
KNN
If a student misses class, as the teacher, whom will you ask about the absence? The classmates sitting nearest to that student. KNN makes predictions the same way, using two ingredients: Distance and Voting.
KNN
Training dataset: see the table in Step-1 below.
Testing sample: Height = 172 cm and Weight = 70 kg; Class = ?
Step-1: Calculate the distance from the test sample to every training sample

Height   Weight   Class         Distance                                Nearest?
160      55       Athlete       sqrt((172-160)² + (70-55)²) = 19.21     4
170      65       Athlete       sqrt((172-170)² + (70-65)²) = 5.38      1
175      75       Non-Athlete   sqrt((172-175)² + (70-75)²) = 5.83      2
180      85       Non-Athlete   sqrt((172-180)² + (70-85)²) = 17        3
Step-2: Select the K nearest examples and assign the most common class
● K=1 → class = Athlete
● K=2 → class = Tie
● K=3 → class = Non-Athlete
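A minimal sketch of this worked example in plain Python (standard library only): it ranks the training samples by Euclidean distance and votes among the K nearest.

```python
# KNN by hand: Euclidean distance + majority voting.
from collections import Counter
from math import dist

train = [((160, 55), "Athlete"), ((170, 65), "Athlete"),
         ((175, 75), "Non-Athlete"), ((180, 85), "Non-Athlete")]
test = (172, 70)

# Step-1: sort training samples by distance to the test sample
ranked = sorted(train, key=lambda s: dist(s[0], test))

# Step-2: take the K nearest and vote
for k in (1, 2, 3):
    votes = Counter(label for _, label in ranked[:k])
    print(k, votes.most_common())
# K=1 -> Athlete; K=2 -> 1-1 tie; K=3 -> Non-Athlete wins 2-1
```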
How to resolve a tie?
● Reduce the value of K
● Use weighted voting based on distance
● Use a tiebreaker rule: select the class that occurs most frequently in the entire dataset (the global majority class)
Distance Measure
● Euclidean Distance
● Manhattan Distance
● Minkowski Distance
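A small sketch for intuition: Minkowski distance generalises the other two through the exponent r (r=2 gives Euclidean, r=1 gives Manhattan). The point values are reused from the KNN example above.

```python
# Minkowski distance; r=2 is Euclidean, r=1 is Manhattan.
def minkowski(p, q, r):
    return sum(abs(a - b) ** r for a, b in zip(p, q)) ** (1 / r)

p, q = (172, 70), (170, 65)
print(minkowski(p, q, 2))   # Euclidean: 5.385
print(minkowski(p, q, 1))   # Manhattan: 7.0
```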
Clustering
Welcome to unsupervised learning: learning by observation.
Types of clustering
● Hierarchical Clustering: Agglomerative (bottom-up), Divisive (top-down)
● Density-Based Clustering: DBSCAN
● Grid-Based Clustering
● Partitioning-Based Clustering: K-means, PAM, CLARA
K-means algorithm

    v1    v2
1   1.0   1.0
2   1.5   2.0
3   3.0   4.0
4   1.0   3.0
5   3.5   5.0
6   4.5   5.0
7   3.5   4.5
K-means Clustering
Step-1: Take any k samples as the initial centroids of your clusters.
Let k1 = (1, 1) and k2 = (1.5, 2)
K-means clustering
Step-2: Calculate the distance from each centroid to every sample

Point        Distance from k1 (1,1)   Distance from k2 (1.5, 2)
(1, 1)       0                        1.11
(1.5, 2)     1.11                     0
(3, 4)       3.6                      2.5
(1, 3)       2                        1.11
(3.5, 5)     4.71                     3.6
(4.5, 5)     5.31                     4.24
(3.5, 4.5)   4.3                      3.2
K-means clustering
Step-3: For each point, find the nearest centroid and assign the point to that cluster

Point        Distance from k1 (1,1)   Distance from k2 (1.5, 2)   Assigned cluster
(1, 1)       0                        1.11                        K1
(1.5, 2)     1.11                     0                           K2
(3, 4)       3.6                      2.5                         K2
(1, 3)       2                        1.11                        K2
(3.5, 5)     4.71                     3.6                         K2
(4.5, 5)     5.31                     4.24                        K2
(3.5, 4.5)   4.3                      3.2                         K2
K-means clustering
Now the two clusters are: {(1, 1)} and
{(1.5, 2), (3, 4), (1, 3), (3.5, 5), (4.5, 5), (3.5, 4.5)}
K-means clustering
Step-4: Calculate a new centroid for each cluster, which is the average of all the samples in that cluster.
For the first cluster, K1 = (1, 1)
For the second cluster,
K2 = ((1.5+3+1+3.5+4.5+3.5)/6, (2+4+3+5+5+5)/6) = (2.83, 4)
K-means clustering
Repeat steps 2-4 with the new centroids, until the centroids no longer change.

Point        Distance from k1 (1,1)   Distance from k2 (2.83, 4)   Assigned cluster
(1, 1)       0                        3.51                         K1
(1.5, 2)     1.11                     2.4                          K1
(3, 4)       3.6                      0.17                         K2
(1, 3)       2                        2.08                         K1
(3.5, 5)     4.71                     1.2                          K2
(4.5, 5)     5.31                     1.94                         K2
(3.5, 4.5)   4.3                      0.83                         K2
K-means clustering
Now the two clusters are: {(1, 1), (1.5, 2), (1, 3)} and
{(3, 4), (3.5, 5), (4.5, 5), (3.5, 4.5)}
K-means Clustering
New centroids:
for the first cluster: ((1+1.5+1)/3, (1+2+3)/3) = (1.17, 2)
for the second cluster: ((3+3.5+4.5+3.5)/4, (4+5+5+4.5)/4) = (3.62, 4.62)

Do it yourself: run the same process again with the new centroids and see whether the centroids change any more.
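As a check, here is a minimal sketch of the same run, assuming scikit-learn is available: k=2 with the same two starting centroids from Step-1.

```python
# K-means on the 7 points above, starting from the Step-1 centroids.
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1.0, 1.0], [1.5, 2.0], [3.0, 4.0], [1.0, 3.0],
              [3.5, 5.0], [4.5, 5.0], [3.5, 4.5]])
init = np.array([[1.0, 1.0], [1.5, 2.0]])   # k1 and k2 from Step-1

km = KMeans(n_clusters=2, init=init, n_init=1).fit(X)
print(km.labels_)            # cluster assignment of each point
print(km.cluster_centers_)   # converged centroids, ~ (1.17, 2) and (3.62, 4.62)
```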