Satellite Image Classification using Decision Tree, SVM and k-Nearest Neighbor
SATELLITE IMAGE CLASSIFICATION USING K-NN, SVM, AND DECISION TREE
1. I Gede B. P. (P66077042)
2. Umroh Dian S. (P66077050)
3. Iva N. (P66067021)
4. M. Irsyadi F. (P66067055)
K-NN
• The K-Nearest-Neighbor algorithm classifies an object based on the labels of its closest training samples in the feature space.
• The K value that gives the minimum error rate may be selected for K-Nearest-Neighbor classification.
• The usual distance function for K-Nearest-Neighbor is Euclidean distance. This distance-based comparison assigns equal weight to each attribute, so accuracy can suffer when noisy or irrelevant attributes are present.
• It classifies a test pattern by comparing it with the training patterns most similar to it, and it is widely used in pattern recognition.
(K-Nearest-Neighbor)
[Figure: k-NN illustration with feature axes "Sounds" and "Claws"; the unknown sample "?" is assigned to the majority class among its K = 3 nearest neighbors]
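The neighbor-voting scheme described above can be sketched in a few lines of Python; the feature vectors and class names below are hypothetical stand-ins, not the study's data.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify one sample by majority vote among its k nearest
    training samples, using Euclidean distance (equal weight per attribute)."""
    dists = np.linalg.norm(X_train - x, axis=1)  # Euclidean distance to every training sample
    nearest = np.argsort(dists)[:k]              # indices of the k closest samples
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]             # majority class among the k neighbors

# Hypothetical 2-D feature vectors (e.g. two spectral bands) with class labels
X = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [4.0, 4.0], [4.2, 3.9]])
y = np.array(["vegetation", "vegetation", "vegetation", "building", "building"])

print(knn_predict(X, y, np.array([1.1, 1.0]), k=3))  # → vegetation
```

A small K follows local structure closely but is more sensitive to noisy labels; the minimum-error K is usually found by validation.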
SVM
• SVM classification uses separating hyperplanes in feature space to divide the data points.
• It gains flexibility in the choice of threshold and handles large amounts of input data very efficiently.
• Its performance and accuracy depend on the selection of the hyperplane and the kernel parameters.
• The goal of SVM classification is to produce a model, based on the training data, that predicts the class labels of the test data accurately.
(Support Vector Machine)
[Figure: Cupcake vs. Muffin. Cupcakes are topped with creamy, delicious frosting; muffins may have a sugared top or a very thin glaze. An analogy for separating two similar-looking classes.]
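As a concrete sketch, scikit-learn's SVC exposes the hyperplane and kernel choices mentioned above; the toy points below are hypothetical, not from the study.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical, well-separated toy points standing in for two land-cover classes
X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
              [3.0, 3.0], [3.2, 2.9], [2.9, 3.1]])
y = np.array([0, 0, 0, 1, 1, 1])

# Performance and accuracy depend on the kernel and its parameters
# (here: RBF kernel, regularization C, kernel width gamma)
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X, y)

print(clf.predict([[0.1, 0.2], [3.1, 3.0]]))  # → [0 1]
```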
DT
• A decision tree classifier consists of three main parts: partitioning the nodes, finding the terminal nodes, and allocating class labels to the terminal nodes.
• It is based on hierarchical rules. It handles high-dimensional data and represents knowledge in tree form, which is easy for humans to understand.
• When a decision tree is built, many branches may reflect noise in the training patterns; tree pruning attempts to identify and remove such branches and so improve classification accuracy.
(Decision Tree)
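The pruning step described above can be sketched with scikit-learn's cost-complexity pruning (`ccp_alpha`); the one-feature data below, including one deliberately noisy label, is hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical 1-D training data: two clean clusters plus one noisy label (0.15 -> class 1)
X = np.array([[0.0], [0.1], [0.15], [0.2], [1.0], [1.1], [1.2]])
y = np.array([0, 0, 1, 0, 1, 1, 1])

# An unpruned tree grows extra branches just to fit the noisy sample;
# cost-complexity pruning (ccp_alpha > 0) removes such branches
full = DecisionTreeClassifier(random_state=0).fit(X, y)
pruned = DecisionTreeClassifier(ccp_alpha=0.12, random_state=0).fit(X, y)

print(full.get_depth(), pruned.get_depth())  # the pruned tree is shallower
print(pruned.predict([[0.05], [1.05]]))      # the branch fitting the noise is gone
```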
Data
• High resolution: Pleiades; Colorado, USA; 2012; image size 1300 x 1300 pixels
• Low resolution: Landsat 8; North Taiwan; 2018; image size 4300 x 4300 pixels
Processing: ENVI 5.3
High Resolution
Training and Testing Sample
Datasets
• The red color = building region
• The green color = vegetation region
• The blue color = road region
• The yellow color = concrete region
Land Cover   Training Area (Segment)
Vegetation   1726
Building     486
Road         72
Concrete     420
High Resolution
Ground Truth
• Digitization
• Manual interpretation
• Divided into 4 classes:
1. Building
2. Road
3. Vegetation
4. Concrete
Low Resolution
Training and Testing Sample
Datasets
• The red color = urban region
• The green color = vegetation region
• The blue color = water region
Land Cover   Training Area (Segment)
Vegetation   11275
Water        98
Urban        10964
Accuracy Assessment and Comparisons
Overall Accuracy (%)   SVM     DT      KNN
High Resolution        78.60   68.41   76.26
Low Resolution         83.30   59.08   82.34
In high resolution, the classification accuracies of SVM and DT were
significantly different, while the accuracies of SVM and KNN were not
significantly different.
SVM always showed the most accurate results, followed by KNN and
Decision Tree.
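Overall accuracy and the Kappa coefficient used in this assessment can be computed as in the sketch below; the ten ground-truth/predicted labels are made up for illustration, not the study's data.

```python
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score

# Hypothetical ground-truth and predicted land-cover labels for 10 pixels
y_true = np.array(["veg", "veg", "veg", "veg", "bld", "bld", "bld", "road", "road", "road"])
y_pred = np.array(["veg", "veg", "veg", "bld", "bld", "bld", "bld", "road", "road", "veg"])

# Overall accuracy = fraction of correctly classified pixels
print(accuracy_score(y_true, y_pred))               # → 0.8
# Kappa corrects the agreement for what would be expected by chance
print(round(cohen_kappa_score(y_true, y_pred), 3))  # → 0.697
```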
[Figure: Comparing the performance by training sample size: overall accuracy (%) plotted against the number of training samples (0 to 3000) for SVM and KNN]
Training Sample Comparison
SVM has more flexibility in the choice of threshold than both other methods.

SVM
Training Data   Overall Accuracy (%)   Difference (%)
2704            78.60                  -
1354            76.36                  2.24
238             73.13                  3.23
157             73.16                  0.03

K-NN
Training Data   Overall Accuracy (%)   Difference (%)
2704            76.26                  -
1354            73.48                  2.78
238             65.66                  7.82
157             54.10                  11.56
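The kind of experiment summarized in these tables, shrinking the training set and re-measuring overall accuracy for SVM and KNN, can be sketched as below on synthetic data (the points, subset sizes, and seed are assumptions, not the study's).

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Synthetic 2-D features for two well-separated classes, shuffled into train/test
X = np.vstack([rng.normal(0.0, 0.5, (200, 2)), rng.normal(3.0, 0.5, (200, 2))])
y = np.array([0] * 200 + [1] * 200)
idx = rng.permutation(400)
X_train, y_train = X[idx[:300]], y[idx[:300]]
X_test, y_test = X[idx[300:]], y[idx[300:]]

# Refit both classifiers on progressively smaller training subsets
for n in (300, 150, 30):
    svm_acc = accuracy_score(y_test, SVC().fit(X_train[:n], y_train[:n]).predict(X_test))
    knn_acc = accuracy_score(y_test, KNeighborsClassifier(3).fit(X_train[:n], y_train[:n]).predict(X_test))
    print(f"n={n}: SVM {svm_acc:.2f}, KNN {knn_acc:.2f}")
```

On this easy synthetic problem both classifiers stay accurate; the study's tables show the gap that opens up on real imagery as the training set shrinks.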
The Impact of Distance in Training Sample
Training samples: 2704 segments
Land Cover   Training Area (Segment)
Vegetation   1726
Building     486
Road         72
Concrete     420

Training samples: 157 segments
Land Cover   Training Area (Segment)
Vegetation   77
Building     38
Road         4
Concrete     38

Results with 2704 training segments
Classifier Method   Overall Accuracy (%)
SVM                 78.60
KNN                 76.26

Results with 157 training segments
Classifier Method   Overall Accuracy (%)
SVM                 73.16
KNN                 54.10
The distance of the training samples has a significant impact on the classification result.
Closer training samples produce a better result, especially with the KNN method.
Conclusion
• The SVM method has the best accuracy compared to the Decision
Tree and k-Nearest Neighbor methods.
• The Kappa coefficient of the SVM method is also higher than that
of both other methods.
• The size of the training sample has more impact on the
classification accuracy for KNN and DT than for SVM. In addition,
SVM has more flexibility in the choice of threshold than both
other methods.
• The distance of the training samples affects the classification
result. Especially in the KNN method, we can see a large
difference in overall accuracy depending on the number of
training samples used.
Figure: Example of k-NN classification. The test sample (green circle) should be classified either to the first class of blue squares or to the second class of red triangles. If k = 3 (solid line circle) it is assigned to the second class because there are 2 triangles and only 1 square inside the inner circle. If k = 5 (dashed line circle) it is assigned to the first class (3 squares vs. 2 triangles inside the outer circle).