Satellite Image Classification using Decision Tree, SVM and k-Nearest Neighbor
SATELLITE IMAGE CLASSIFICATION USING K-NN, SVM, AND DECISION TREE
1. I Gede B. P. (P66077042)
2. Umroh Dian S. (P66077050)
3. Iva N. (P66067021)
4. M. Irsyadi F. (P66067055)
K-NN
• The K-Nearest-Neighbor algorithm classifies an object based on the labels of its closest training samples in the feature space.
• The K value that gives the minimum error rate may be selected for K-Nearest-Neighbor classification.
• The usual distance function for K-Nearest-Neighbor is Euclidean distance. This distance-based comparison assigns equal weight to each attribute, so accuracy can suffer when noisy or irrelevant attributes are present.
• It classifies a test pattern by comparing it with the training patterns most similar to it, and it is widely used in pattern recognition.
(K-Nearest-Neighbor)
[Figure: k-NN illustration with feature axes "Sounds" and "Claws"; the unknown sample "?" is assigned to the majority class among its K = 3 nearest neighbors]
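The neighbor-voting scheme described above can be sketched in a few lines of Python; the feature vectors and class names below are hypothetical stand-ins, not the study's data.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify one sample by majority vote among its k nearest
    training samples, using Euclidean distance (equal weight per attribute)."""
    dists = np.linalg.norm(X_train - x, axis=1)  # Euclidean distance to every training sample
    nearest = np.argsort(dists)[:k]              # indices of the k closest samples
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]             # majority class among the k neighbors

# Hypothetical 2-D feature vectors (e.g. two spectral bands) with class labels
X = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [4.0, 4.0], [4.2, 3.9]])
y = np.array(["vegetation", "vegetation", "vegetation", "building", "building"])

print(knn_predict(X, y, np.array([1.1, 1.0]), k=3))  # → vegetation
```

A small K follows local structure closely but is more sensitive to noisy labels; the minimum-error K is usually found by validation.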
SVM
• SVM classification uses separating hyperplanes in feature space to divide the data points.
• It gains flexibility in the choice of threshold and handles large amounts of input data very efficiently.
• Its performance and accuracy depend on the selection of the hyperplane and the kernel parameters.
• The goal of SVM classification is to produce a model, based on the training data, that predicts the class labels of the test data accurately.
(Support Vector Machine)
[Figure: Cupcake vs. Muffin. Cupcakes are topped with creamy, delicious frosting; muffins may have a sugared top or a very thin glaze. An analogy for separating two similar-looking classes.]
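As a concrete sketch, scikit-learn's SVC exposes the hyperplane and kernel choices mentioned above; the toy points below are hypothetical, not from the study.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical, well-separated toy points standing in for two land-cover classes
X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
              [3.0, 3.0], [3.2, 2.9], [2.9, 3.1]])
y = np.array([0, 0, 0, 1, 1, 1])

# Performance and accuracy depend on the kernel and its parameters
# (here: RBF kernel, regularization C, kernel width gamma)
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X, y)

print(clf.predict([[0.1, 0.2], [3.1, 3.0]]))  # → [0 1]
```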
DT
• A decision tree classifier consists of three main parts: partitioning the nodes, finding the terminal nodes, and allocating class labels to the terminal nodes.
• It is based on hierarchical rules. It handles high-dimensional data and represents knowledge in tree form, which is easy for humans to understand.
• When a decision tree is built, many branches may reflect noise in the training patterns; tree pruning attempts to identify and remove such branches and so improve classification accuracy.
(Decision Tree)
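The pruning step described above can be sketched with scikit-learn's cost-complexity pruning (`ccp_alpha`); the one-feature data below, including one deliberately noisy label, is hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical 1-D training data: two clean clusters plus one noisy label (0.15 -> class 1)
X = np.array([[0.0], [0.1], [0.15], [0.2], [1.0], [1.1], [1.2]])
y = np.array([0, 0, 1, 0, 1, 1, 1])

# An unpruned tree grows extra branches just to fit the noisy sample;
# cost-complexity pruning (ccp_alpha > 0) removes such branches
full = DecisionTreeClassifier(random_state=0).fit(X, y)
pruned = DecisionTreeClassifier(ccp_alpha=0.12, random_state=0).fit(X, y)

print(full.get_depth(), pruned.get_depth())  # the pruned tree is shallower
print(pruned.predict([[0.05], [1.05]]))      # the branch fitting the noise is gone
```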
Data
• High resolution: Pleiades; Colorado, USA; 2012; image size 1300 x 1300 pixels
• Low resolution: Landsat 8; North Taiwan; 2018; image size 4300 x 4300 pixels
Processing: ENVI 5.3
High Resolution
Training and Testing Sample
Datasets
• The red color = building region
• The green color = vegetation region
• The blue color = road region
• The yellow color = concrete region
Land Cover   Training Area (Segment)
Vegetation   1726
Building     486
Road         72
Concrete     420
High Resolution
Ground Truth
• Digitization
• Manual interpretation
• Divided into 4 classes:
1. Building
2. Road
3. Vegetation
4. Concrete
Low Resolution
Training and Testing Sample
Datasets
• The red color = urban region
• The green color = vegetation region
• The blue color = water region
Land Cover   Training Area (Segment)
Vegetation   11275
Water        98
Urban        10964
Accuracy Assessment and Comparisons
Overall Accuracy (%)   SVM     DT      KNN
High Resolution        78.60   68.41   76.26
Low Resolution         83.30   59.08   82.34
In high resolution, the classification accuracies of SVM and DT were
significantly different, while the accuracies of SVM and KNN were not
significantly different.
SVM always showed the most accurate results, followed by KNN and
Decision Tree.
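Overall accuracy and the Kappa coefficient used in this assessment can be computed as in the sketch below; the ten ground-truth/predicted labels are made up for illustration, not the study's data.

```python
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score

# Hypothetical ground-truth and predicted land-cover labels for 10 pixels
y_true = np.array(["veg", "veg", "veg", "veg", "bld", "bld", "bld", "road", "road", "road"])
y_pred = np.array(["veg", "veg", "veg", "bld", "bld", "bld", "bld", "road", "road", "veg"])

# Overall accuracy = fraction of correctly classified pixels
print(accuracy_score(y_true, y_pred))               # → 0.8
# Kappa corrects the agreement for what would be expected by chance
print(round(cohen_kappa_score(y_true, y_pred), 3))  # → 0.697
```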
[Figure: Comparing the performance by training sample size: overall accuracy (%) plotted against the number of training samples (0 to 3000) for SVM and KNN]
Training Sample Comparison
SVM has more flexibility in the choice of threshold than both other methods.

SVM
Training Data   Overall Accuracy (%)   Difference (%)
2704            78.60                  -
1354            76.36                  2.24
238             73.13                  3.23
157             73.16                  0.03

K-NN
Training Data   Overall Accuracy (%)   Difference (%)
2704            76.26                  -
1354            73.48                  2.78
238             65.66                  7.82
157             54.10                  11.56
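The kind of experiment summarized in these tables, shrinking the training set and re-measuring overall accuracy for SVM and KNN, can be sketched as below on synthetic data (the points, subset sizes, and seed are assumptions, not the study's).

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Synthetic 2-D features for two well-separated classes, shuffled into train/test
X = np.vstack([rng.normal(0.0, 0.5, (200, 2)), rng.normal(3.0, 0.5, (200, 2))])
y = np.array([0] * 200 + [1] * 200)
idx = rng.permutation(400)
X_train, y_train = X[idx[:300]], y[idx[:300]]
X_test, y_test = X[idx[300:]], y[idx[300:]]

# Refit both classifiers on progressively smaller training subsets
for n in (300, 150, 30):
    svm_acc = accuracy_score(y_test, SVC().fit(X_train[:n], y_train[:n]).predict(X_test))
    knn_acc = accuracy_score(y_test, KNeighborsClassifier(3).fit(X_train[:n], y_train[:n]).predict(X_test))
    print(f"n={n}: SVM {svm_acc:.2f}, KNN {knn_acc:.2f}")
```

On this easy synthetic problem both classifiers stay accurate; the study's tables show the gap that opens up on real imagery as the training set shrinks.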
The Impact of Distance in Training Sample
Training samples: 2704 segments
Land Cover   Training Area (Segment)
Vegetation   1726
Building     486
Road         72
Concrete     420

Training samples: 157 segments
Land Cover   Training Area (Segment)
Vegetation   77
Building     38
Road         4
Concrete     38

Results with 2704 training segments
Classifier Method   Overall Accuracy (%)
SVM                 78.60
KNN                 76.26

Results with 157 training segments
Classifier Method   Overall Accuracy (%)
SVM                 73.16
KNN                 54.10
The distance of the training samples has a significant impact on the classification result.
Closer training samples produce a better result, especially with the KNN method.
Conclusion
• The SVM method has the best accuracy compared to the Decision
Tree and k-Nearest Neighbor methods.
• The Kappa coefficient of the SVM method is also higher than that
of both other methods.
• The size of the training sample has more impact on the
classification accuracy for KNN and DT than for SVM. In addition,
SVM has more flexibility in the choice of threshold than both
other methods.
• The distance of the training samples affects the classification
result. Especially in the KNN method, we can see a large
difference in overall accuracy depending on the number of
training samples used.
Figure: Example of k-NN classification. The test sample (green circle) should be classified either to the first class of blue squares or to the second class of red triangles. If k = 3 (solid line circle) it is assigned to the second class because there are 2 triangles and only 1 square inside the inner circle. If k = 5 (dashed line circle) it is assigned to the first class (3 squares vs. 2 triangles inside the outer circle).