The document presents a novel biometric-based approach for fish species classification using computer vision techniques. The approach extracts texture features using Weber's Local Descriptor (WLD) and color features using color moments from fish images. Linear Discriminant Analysis (LDA) is used to reduce the high dimensionality of WLD features. An AdaBoost classifier is then used to identify fish species based on the extracted features, achieving an accuracy of approximately 96.4% on a dataset of four fish species. The model is also tested on rotated and translated images with promising results.
namely: Argyrosomus regius, Sardinella maderensis, Scomberomorus com-
merson, and Trachinotus ovatus. These four species were selected be-
cause they have specific nutritional and functional importance; moreover,
these species are common in Egypt and different areas of the world. The
Weber's Local Descriptor (WLD) and color moments methods were
adopted to extract texture and color features, respectively. Due to the
high dimensionality of the WLD features, Linear Discriminant Analysis
(LDA) technique was used to reduce the number of features and to in-
crease the separability between different classes. The class label of an
unknown sample can be predicted using the AdaBoost classifier, which
matches the features of the unknown sample with the features of la-
beled or training data.
The rest of the paper is organized as follows: Section 2 summarizes
the related work of the fish identification system based on information
technology. Section 3 gives an overview of the techniques and methods
that are used in the proposed approach, while Section 4 describes our
approach in detail. Experimental results and discussions are presented
in Section 5. Finally, conclusions are summarized in Section 6.
2. Related work
Many automated fish identification systems have been
proposed (Iscsmen et al., 2014; Hnin and Lynn, 2016; Shafait et al.,
2016). Cadieux et al. proposed an intelligent system for automated fish
sorting and counting (Cadieux et al., 2000). They employed the Neural
Networks (NN) for classification, and for the features, they utilized
some shape features including moment invariants, Fourier descriptors,
and nine shape features, and they achieved an accuracy that ranged
from 70.8% to 72.7%. In another research, Lee et al. introduced a
system for fish recognition and migration monitoring (Lee et al., 2004).
Their system depends on extracting the fish shape and then applying
shape matching for recognition. They matched the whole shape; several
shape descriptors, such as Fourier descriptors, polygon approximation,
and line segments, were tested; and they reported an accu-
racy near 60%. Rova et al. used deformable template matching to
extract texture features and Support Vector Machine (SVM) for classi-
fication, and their model revealed 90% accuracy rate (Rova et al.,
2007). Instead of using one type of features as in Cadieux et al. (2000);
Lee et al. (2004); Rova et al. (2007), our model combined the color and
texture features.
Chambah et al. extracted 85 features, which consist of geometric
features (e.g. area, perimeter, and elongation), color features (e.g. hue,
gray levels, and color histograms), motion features, and texture features
(e.g. entropy and correlation). They also employed the quadratic Bayes
classifier for classification, and they achieved 85.77% accuracy rate
(Chambah et al., 2003). Spampinato et al. classified 320 fish images
which were collected from 10 different classes, and they achieved a
92% accuracy rate (Spampinato et al., 2010). They extracted texture and
shape features such as Gabor features and Fourier descriptors, and they
employed discriminant analysis classifier. Larsen et al. also used shape
and texture features with linear discriminant classifier to classify three
species and they obtained a recognition rate of 76% (Larsen et al.,
2009). Three different biometric techniques (Euclidean network tech-
nique, quadratic network technique, triangulation technique) were
employed with Naive Bayesian classifier in Iscsmen et al. (2014), and
the accuracy was 93.10% for seven species and 75.71% for 15 species.
Due to the uncontrollable environment, fish species were also classified
using video images as in Shafait et al. (2016). They aimed to classify
and track fish in a real environment. In all mentioned studies, one
single classifier was used. However, our model employed the AdaBoost
classifier which is based on combining the outputs of different single
classifiers to improve the classification robustness.
3. Preliminaries
This section provides overviews of the algorithms and methods that
were used in the design of the proposed model.
3.1. Weber's Local Descriptor (WLD)
WLD is an image descriptor method which describes the image as a
histogram of Differential Excitation (ξj) and Orientation (ϕt) (Chen
et al., 2010; Gaber et al., 2016). WLD is originally inspired by Weber's
Law that was proposed by Ernst Weber in the 19th century, where the
ratio between an increment threshold and the background intensity is
constant, and this law can be formally expressed as follows:
ΔI / I = k    (1)

where ΔI is the increment threshold, I represents the initial intensity
or an image background, k is a constant even if I changes, and the
fraction ΔI/I is the Weber fraction (Chen et al., 2010).
In the WLD method, the features are extracted from each pixel in an
image. The WLD algorithm has three main steps: (1) finding the differential
excitations, (2) calculating the gradient orientations, and (3) building
the histogram. In the first step of the WLD algorithm, the differential
excitation of the image is computed for each pixel in the input image,
and the gradient orientation is then calculated. In the third step of the
WLD algorithm, the differential excitation and gradient orientation are
combined to form a WLD histogram (Chen et al., 2010; Gaber et al.,
2016). More details about each step are explained below.
3.1.1. Differential excitation
In this step, the differential excitation (ξ) for each pixel is calcu-
lated. First, the differences between the center pixel xc and its sur-
rounding neighbors are calculated as follows:
ν_s^00 = Σ_{i=0}^{p−1} (Δx_i) = Σ_{i=0}^{p−1} (x_i − x_c)    (2)
where p is the number of neighbors and xi(i = 0, …, p − 1) is the
intensity of the ith neighbor of xc. Fig. 1 shows an illustrative example
to calculate the differential excitation where the number of neighbors
for xc was eight, i.e., p = 8. The number of neighbors or the patch/
window size is a user defined parameter. As shown in the figure, four
filters, f^00, f^01, f^10, and f^11, are used to calculate ν_s^00, ν_s^01,
ν_s^10, and ν_s^11, respectively, where ν_s^00 is the sum of the differences
between x_c and its neighbors, ν_s^01 = x_c, ν_s^10 = x_5 − x_1, and
ν_s^11 = x_7 − x_3. The ratio between the differences ν_s^00 and ν_s^01 is
then calculated as follows: G_ratio(x_c) = ν_s^00 / ν_s^01. The
arc-tangent function is then applied on G_ratio(·) to get the differential
excitation of x_c, as in Eq. (3).

ξ(x_c) = arctan[G_ratio(x_c)] = arctan[ Σ_{i=0}^{p−1} (x_i − x_c) / x_c ]    (3)
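As an illustration, the differential excitation of Eqs. (2) and (3) can be sketched in Python for a 3 × 3 patch (p = 8). The border padding mode and the small eps guard against division by zero are our assumptions; the paper does not specify how borders or zero-intensity pixels are handled.

```python
import numpy as np

def differential_excitation(image, eps=1e-7):
    """Differential excitation xi(x_c) for every pixel, 3x3 neighborhood (p = 8).

    Sketch of Eqs. (2)-(3); the edge padding and eps guard are assumptions.
    """
    img = np.asarray(image, dtype=np.float64)
    padded = np.pad(img, 1, mode="edge")          # replicate borders
    # v_s00: sum of differences between the 8 neighbors and the center pixel
    v00 = np.zeros_like(img)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue
            neighbor = padded[1 + di : 1 + di + img.shape[0],
                              1 + dj : 1 + dj + img.shape[1]]
            v00 += neighbor - img
    # G_ratio = v_s00 / v_s01 with v_s01 = x_c; arctan bounds the result
    return np.arctan(v00 / (img + eps))

xi = differential_excitation(np.array([[10, 10, 10],
                                       [10, 20, 10],
                                       [10, 10, 10]]))
```

For the bright center pixel the eight differences sum to −80, so the excitation is arctan(−80/20) = arctan(−4), a strongly negative response to the local contrast.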
3.1.2. Orientation
In this step, the orientation of the current pixel (xc) is computed by
calculating the differences in the horizontal and vertical directions as
follows:
θ(x_c) = arctan(ν_s^11 / ν_s^10) = arctan[(x_7 − x_3) / (x_5 − x_1)]    (4)
The gradient orientation is then quantized by transforming it into T
dominant orientations, where θ is mapped to θ′ as follows (Gong et al.,
2011):
θ′ = arctan2(ν_s^11, ν_s^10) + π    (5)
where
A. Tharwat et al. Fisheries Research 204 (2018) 324–336
325
arctan2(ν_s^11, ν_s^10) =
  θ,      if ν_s^11 > 0 and ν_s^10 > 0
  π − θ,  if ν_s^11 > 0 and ν_s^10 < 0
  θ − π,  if ν_s^11 < 0 and ν_s^10 < 0
  −θ,     if ν_s^11 < 0 and ν_s^10 > 0
    (6)
where θ ∈ [−π/2, π/2] and θ′ ∈ [0, 2π].
The quantization function is calculated as follows:
ϕ_t = f_q(θ′) = (2t(θ′)/T) π,  and  t(θ′) = mod( ⌊ θ′ / (2π/T) + 0.5 ⌋, T )    (7)
3.1.3. WLD histogram
After calculating the Differential Excitation (ξj) and Orientation (ϕt)
at each pixel (as described in Sections 3.1.1 and 3.1.2), the WLD his-
togram is computed. This histogram consists of (ξj, ϕt), j = 0, …, N − 1
and t = 0, 1, …, T − 1, where N is the dimensionality of an image and T
represents the number of dominant orientations (Chen et al., 2008;
Gong et al., 2011). The steps of the WLD algorithm are summarized in
Algorithm 1.
Algorithm 1. WLD algorithm.

1: Given the size of the patch, e.g. 3 × 3, 5 × 5, 7 × 7, etc.
2: for all pixels in an image do
3:   Calculate ν_s^00 = Σ_{i=0}^{p−1} (Δx_i) = Σ_{i=0}^{p−1} (x_i − x_c).
4:   Compute G_ratio as follows:
       G_ratio(x_c) = ν_s^00 / ν_s^01 = Σ_{i=0}^{p−1} (Δx_i / x_c)    (8)
5:   Calculate the differential excitation as follows:
       ξ(x_c) = arctan(G_ratio) = arctan[ Σ_{i=0}^{p−1} (Δx_i / x_c) ] = arctan[ Σ_{i=0}^{p−1} (x_i − x_c) / x_c ]    (9)
6: end for
7: for all pixels in an image do
8:   Calculate the differences in the horizontal and vertical directions of x_c as follows:
       θ(x_c) = arctan(ν_s^11 / ν_s^10) = arctan[(x_7 − x_3) / (x_5 − x_1)]    (10)
9:   Transform θ ∈ [−π/2, π/2] to θ′ ∈ [0, 2π] as follows:
       θ′ = arctan2(ν_s^11, ν_s^10) + π    (11)
     where arctan2(ν_s^11, ν_s^10) is calculated as in Eq. (6).
10:  Compute the quantization function as follows: ϕ_t = (2t/T)π.
11: end for
Fig. 1. Illustration of the computation of the WLD algorithm.
12: Calculate the WLD histogram WLD(ξ_j, ϕ_t), where j = 0, …, N − 1 and
t = 0, 1, …, T − 1.
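The full pipeline of Algorithm 1 can be sketched end-to-end for a fixed 3 × 3 patch. The border padding, the number of excitation bins, the eps guard, and the assumption that x_1/x_5 are the pixels above/below and x_3/x_7 to the right/left of x_c are all ours; the paper leaves these details unspecified.

```python
import numpy as np

def wld_histogram(image, T=8, xi_bins=20, eps=1e-7):
    """2-D WLD histogram over (differential excitation, dominant orientation),
    a sketch of Algorithm 1 with a 3x3 patch. Binning choices are assumptions."""
    img = np.asarray(image, dtype=np.float64)
    H, W = img.shape
    p = np.pad(img, 1, mode="edge")
    # Steps 2-5: differential excitation at every pixel
    v00 = sum(p[1 + di:1 + di + H, 1 + dj:1 + dj + W] - img
              for di in (-1, 0, 1) for dj in (-1, 0, 1) if (di, dj) != (0, 0))
    xi = np.arctan(v00 / (img + eps))                  # in (-pi/2, pi/2)
    # Steps 7-10: dominant orientation at every pixel
    v10 = p[2:, 1:-1] - p[:-2, 1:-1]                   # x5 - x1 (below - above)
    v11 = p[1:-1, :-2] - p[1:-1, 2:]                   # x7 - x3 (left - right)
    theta_p = np.arctan2(v11, v10) + np.pi             # theta' in [0, 2*pi]
    t = np.floor(theta_p / (2 * np.pi / T) + 0.5).astype(int) % T
    # Step 12: joint histogram of (xi, phi_t), flattened into a feature vector
    j = np.digitize(xi, np.linspace(-np.pi / 2, np.pi / 2, xi_bins + 1)) - 1
    hist = np.zeros((xi_bins, T))
    np.add.at(hist, (np.clip(j, 0, xi_bins - 1), t), 1)
    return hist.ravel()
```

Each pixel contributes one count, so the histogram sums to the number of pixels; with 20 excitation bins and T = 8 orientations, the descriptor has 160 entries.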
3.2. Color features
Color feature algorithms are widely used in different applications
due to their robustness against different transformations such as rota-
tion and scaling (Ohta et al., 1980; Tharwat et al., 2016b). In this paper,
color moments are used to extract color features. Color moments consist
of the first order, i.e., mean, second order, i.e., variance, and third
order, i.e., skewness (Ohta et al., 1980). The first color moment of the
kth color component (k = 1, 2, 3) is indicated by:
M_k^1 = (1/XY) Σ_{x=1}^{X} Σ_{y=1}^{Y} f_k(x, y)    (12)
where XY is the total number of pixels of the image and fk(x, y)
represents the color value of the kth color component of the image pixel
(x, y). The hth moment of kth color component can be calculated as
follows (Tharwat et al., 2016b):
M_k^h = [ (1/XY) Σ_{x=1}^{X} Σ_{y=1}^{Y} ( f_k(x, y) − M_k^1 )^h ]^{1/h},  h = 2, 3    (13)
3.3. Linear Discriminant Analysis (LDA)
High-dimensional data requires high computational effort. Moreover,
redundant features in the data have a negative impact on the classifi-
cation performance. Dimensionality reduction algorithms are used for
reducing the number of features such as Principal Component Analysis
(PCA) (Tharwat, 2016b), LDA (Tharwat, 2016a), and Independent
Component Analysis (ICA) (Hyvärinen and Oja, 2000). LDA is one of
the well-known dimensionality reduction methods that are used in
machine learning. The aim of LDA is to find a linear combination of
features which linearly separates two or more classes (Baudat and
Anouar, 2000; Tharwat et al., 2017). This goal can be achieved by
finding a transformation matrix W that maximizes the ratio of the
between-class variance, S_b = Σ_{j=1}^{c} (μ_j − μ)(μ_j − μ)^T, to the
within-class variance, S_w = Σ_{j=1}^{c} Σ_{i=1}^{N_j} (x_i^j − μ_j)(x_i^j − μ_j)^T,
where x_i^j is the ith sample of class j, μ_j is the mean of class j, c is the
number of classes, μ represents the mean of all classes, and N_j is the
number of samples in class j (Baudat and Anouar, 2000). This ratio is
called Fisher's formula:
J(W) = (W^T S_b W) / (W^T S_w W)    (14)
The solution of Fisher's formula is a set of eigenvectors (V) and
eigenvalues (λ), and the LDA space consists of the eigenvectors that
have the highest eigenvalues.
In our proposed model, the LDA method is used to reduce the
number of features and to discriminate between different classes, where
the classes represent fish species.
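The LDA space described above can be sketched as follows, using the scatter matrices exactly as defined in Eq. (14) and assuming S_w is invertible (the Small Sample Size case, where it is not, is discussed in Section 4.3). This is an illustrative sketch, not the authors' implementation.

```python
import numpy as np

def lda_fit(X, y, n_components):
    """Fisher LDA: eigenvectors of Sw^{-1} Sb with the largest eigenvalues.

    X: (N, d) sample matrix, y: (N,) class labels. Assumes Sw is non-singular.
    """
    classes = np.unique(y)
    mu = X.mean(axis=0)
    d = X.shape[1]
    Sb = np.zeros((d, d))
    Sw = np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sb += np.outer(mc - mu, mc - mu)      # between-class scatter
        Sw += (Xc - mc).T @ (Xc - mc)         # within-class scatter
    # Maximizing Eq. (14) reduces to the eigenproblem of Sw^{-1} Sb
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(evals.real)[::-1]
    return evecs[:, order[:n_components]].real  # columns span the LDA space

# Training and test samples alike are projected onto the space via X @ W.
```

On two well-separated clusters the learned direction cleanly separates the projected class means, which is the separability property the proposed model relies on.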
3.4. AdaBoost classifier
AdaBoost is a classifier ensemble algorithm, and it consists of a
number of single classifiers. A single classifier is a simple, easy-to-im-
plement, and fast classifier such as a single-level decision tree
(Kuncheva, 2014). The goal of an ensemble classifier is to train its
single classifiers and then combine the weighted single classifiers to
construct a strong classifier.
The AdaBoost learning algorithm is made up of two phases, namely,
training and testing phases. In the training phase, the weights of all
samples (w) are equal as in Algorithm 2. For each iteration (t), the
weights are adjusted, and the samples are selected based on their
weights to train the current single classifier (Ct), and the error rate or
misclassification rate (ϵ_t) for C_t is then calculated. If ϵ_t > 50%, the
weights are reinitialized, and the error rate is calculated again. The
weight for C_t is then calculated as in step 10; increasing the error rate
increases the weight of the single learner (α_t). In the last step, the
weights of the samples are updated as in the eleventh step of
Algorithm 2. In this step, if the jth sample is misclassified then l_j^t = 1;
otherwise, l_j^t = 0. Since the weight of the single classifier (α_t) is less
than one, the new weights (w_j^{t+1}) of the correctly classified samples
will be decreased; otherwise, the weights will be increased. To sum up, the
AdaBoost classifier focuses on the misclassified samples and the pro-
cedure is repeated for many iterations until the performance is satisfied
(Kuncheva, 2014; Gaber et al., 2016).
Algorithm 2. AdaBoost classifier.

1: Given a training set X = {(x_1, y_1), …, (x_N, y_N)}, where y_i is the
   class label of sample x_i ∈ X and N denotes the total number of samples
   in the training set.
2: Initialize the weights w^1 = [w_1^1, …, w_N^1], w_j^1 ∈ [0, 1],
   Σ_{j=1}^{N} w_j^1 = 1. Usually the weights are initialized to be equal,
   i.e., w_j^1 = 1/N, j = 1, …, N.
3: for t = 1 to T do
4:   Take a sample D_t from X using distribution w^t.
5:   Use D_t to train the single classifier C_t with a minimum error rate
     ϵ_t, where ϵ_t = Σ_{j=1}^{N} w_j^t l_j^t, and l_j^t = 1 if C_t
     misclassifies x_j; otherwise, l_j^t = 0.
6:   while ϵ_t ≥ 0.5 do
7:     Reinitialize the weights to w_j^t = 1/N, j = 1, …, N.
8:     Calculate ϵ_t again using the equation in the fifth step.
9:   end while
10:  Compute the weight of each single classifier (α_t) as follows:
       α_t = ϵ_t / (1 − ϵ_t)    (15)
11:  Update the weights of the training samples to be used in the next
     iteration (t + 1) as follows:
       w_j^{t+1} = w_j^t α_t^{(1 − l_j^t)} / Σ_{i=1}^{N} w_i^t α_t^{(1 − l_i^t)},  j = 1, …, N    (16)
12: end for
In the testing phase, all single classifiers of the AdaBoost classifier
are used to classify an unknown sample (xtest) as follows:
μ_t = Σ_{C_t(x_test) = ω_t} ln(1/α_t),  ∀ t = 1, 2, …, T    (17)
where T is the maximum number of iterations and μt indicates the
score of class ωt. The score of each class is calculated and the AdaBoost
classifier assigns the class with a maximum score to the unknown
sample.
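Algorithm 2 and the voting rule of Eq. (17) can be sketched for the two-class case as follows. This is a sketch, not the authors' implementation: we train each stump on the weighted samples directly instead of resampling in step 4, clamp a zero error rate to avoid division by zero, and a multi-class version would need, e.g., a one-vs-rest wrapper.

```python
import numpy as np

class Stump:
    """Single-level decision tree (weak learner) for labels in {-1, +1}."""
    def fit(self, X, y, w):
        best = (0, 0.0, 1, np.inf)
        for f in range(X.shape[1]):
            for thr in np.unique(X[:, f]):
                for sign in (1, -1):
                    pred = np.where(X[:, f] <= thr, -sign, sign)
                    err = w[pred != y].sum()       # weighted error
                    if err < best[3]:
                        best = (f, thr, sign, err)
        self.f, self.thr, self.sign, _ = best
        return self
    def predict(self, X):
        return np.where(X[:, self.f] <= self.thr, -self.sign, self.sign)

def adaboost_fit(X, y, T=15):
    """Sketch of Algorithm 2 (two classes, weighted training, no resampling)."""
    N = len(y)
    w = np.full(N, 1.0 / N)                        # step 2: equal weights
    models = []
    for _ in range(T):
        C = Stump().fit(X, y, w)
        miss = (C.predict(X) != y).astype(float)   # l_j^t
        eps = (w * miss).sum()                     # step 5
        if eps >= 0.5:                             # steps 6-9
            w = np.full(N, 1.0 / N)
            continue
        eps = max(eps, 1e-10)                      # guard for a perfect stump
        alpha = eps / (1 - eps)                    # Eq. (15)
        w = w * alpha ** (1 - miss)                # Eq. (16)
        w /= w.sum()
        models.append((alpha, C))
    return models

def adaboost_predict(models, X):
    """Eq. (17): each weak learner votes with weight ln(1/alpha_t)."""
    score = np.zeros(len(X))
    for alpha, C in models:
        score += np.log(1.0 / alpha) * C.predict(X)
    return np.sign(score)
```

Note how Eq. (16) leaves the weights of misclassified samples (l_j^t = 1) untouched before renormalization while shrinking the rest, so the next stump concentrates on the hard samples.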
4. Proposed fish identification system
The proposed approach consists of four phases: data collection,
feature extraction, feature reduction, and classification. These phases
are explained below.
4.1. Data collection
The dataset contains four different fish species, namely: Argyrosomus
regius, Sardinella maderensis, Scomberomorus commerson, and Trachinotus
ovatus. These four species of fish are common. Moreover, these species
have specific nutritional and functional importance such as promoting
heart health and bone health. The collected dataset consists of 241 color
images that were collected from El Brulus, Kafr El-Sheikh, Egypt, using
a Fujifilm X-T10 camera,2 and were stored in JPEG format. The pro-
posed system that was used to collect our dataset is illustrated in Fig. 2.
As shown, the images were captured from different angles.
Table 1 summarizes the number of samples, minimum dimensions,
and maximum dimensions for each class, and Fig. 3 shows samples of
the collected data. As shown in Fig. 3, the images were captured from
different distances with different illumination and lighting conditions
which affect the dimension of images as listed in Table 1. Samples of the
four classes can be characterized as follows:
• Argyrosomus regius (class 1): has a large head with a long body. Its
mouth is in a terminal position without barbels, and it has small eyes.
The lateral line extends along the body to the caudal fin. Moreover, the
first dorsal fin is shorter than the second, and the anal fin contains
short and very thin spiny rays. In addition, the body color of this class
is silver-gray with bronze on the dorsal part of the body, the base of
the fin is reddish brown, the mouth cavity is yellow-gold, and it is
brownish at postmortem examination. This species reaches up to 2 m
in length and 50 kg in weight.
• Sardinella maderensis or Madeiran sardinella (class 2): this class is
found in the Atlantic Ocean and the Mediterranean Sea. The color of
this class is silver, with gray caudal fins that sometimes have black
tips, and a protruding belly. Moreover, a black membrane is found
between the white upper pectoral fin rays. This class forms schools in
coastal waters.
• Scomberomorus commerson (class 3): this type of fish is found in Asia,
the east coast of Africa, the Middle East, along the northern coastal
areas of the Indian Ocean, and as far east as the South West Pacific
Ocean. The color of this type is blue to dark gray along their backs
and flanks, and the belly is silver with bluish gray color.
• Trachinotus ovatus (class 4): has an elongated and compressed body
with dorsal spines. The color of its back is greenish-gray with silver
sides. The dorsal, anal, and caudal fin lobes are black-tipped.
Additionally, the length of the second dorsal fin and anal fin are
equal to each other, and the lateral line is arched over pectoral fins.
4.2. Feature extraction phase
In this phase, the WLD and color moments methods have been
adopted to extract texture and color features, respectively, as shown in
Fig. 4.
In our proposed model, WLD is applied because it has many features
that are not found in many other widely used feature extraction
methods such as Local Binary Patterns (LBP) (Ahonen et al., 2006) and
Scale Invariant Feature Transformation (SIFT) (Lowe, 1999). For ex-
ample, WLD extracts features from each pixel as in LBP; however, WLD
is more robust against image rotation than LBP. This is because LBP builds
statistics on the local patterns while the WLD algorithm computes
salient patterns and then builds statistics on these patterns with the
gradient orientation of the current pixel. Moreover, LBP compares all
pixels with their surrounding pixels while WLD calculates the ratio of
the intensity differences to the current pixel (Gaber et al., 2016). Hence,
WLD is more robust against noisy pixels and illumination changes as
reported in Chen et al. (2010). Moreover, WLD has only one parameter
(patch size) that needs to be tuned while SIFT has many parameters
such as peak threshold, the number of angles, the number of bins, and
levels of scale space which need to be tuned (Lowe, 1999). Moreover,
the time complexity of SIFT is O(C1(αβ)mn + C2k1 + C3k2st + C4k2st),
where C1, C2, C3, and C4 represent four constants, k1 indicates the
number of keypoint candidates, k2 represents the number of keypoints,
s and t represent the size of the support regions for each keypoint, and α
and β refer to the levels of octave, i.e., scale space, and scales of each
octave, respectively, m and n are the dimensions of the image. While the
complexity of WLD is O(C1mn), where C1 is a constant and it represents
the computation of each pixel in WLD. As a consequence, WLD is much
faster than SIFT (Chen et al., 2010).
Due to the high dimensionality of the WLD features, the LDA
method has been applied (see Section 4.3). However, despite the fact
that the number of color features is small, the LDA algorithm was also
applied to the color features, not to reduce their number, but to project
these features onto the LDA space, which increases the separability
between different classes.
The features, i.e., the color and WLD features, are then normalized/
scaled using Z-score normalization (Jain et al., 2005). This normal-
ization step was applied to make the features from different sources or
algorithms, as in our study, compatible. In Z-score normalization, the
arithmetic mean (μ) and standard deviation (σ) of the given features
(f_old) are calculated, and the new features (f_new) are computed as in
Eq. (18). The normalized features are then combined to form a single
feature vector for each sample.
f_new = (f_old − μ) / σ    (18)
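The normalization-then-fusion step can be sketched as follows. Estimating μ and σ per feature column, and the guard for constant columns, are our assumptions; in practice the statistics should come from the training split only.

```python
import numpy as np

def zscore_fuse(wld_feats, color_feats):
    """Z-score each feature column (Eq. (18)) and concatenate the WLD and
    color feature sets into one vector per sample, as in Section 4.2."""
    def z(f):
        f = np.asarray(f, dtype=np.float64)
        mu, sigma = f.mean(axis=0), f.std(axis=0)
        sigma = np.where(sigma == 0, 1.0, sigma)   # guard constant columns
        return (f - mu) / sigma
    return np.hstack([z(wld_feats), z(color_feats)])
```

After this step every column has zero mean and unit standard deviation, so neither feature source dominates the combined vector by scale alone.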
4.3. Feature reduction phase
The output of the WLD algorithm is a high dimensional feature
vector, which increases the computational cost for the classification
phase. To address these issues, we have applied the LDA algorithm,
described in Section 3.3, to reduce the number of features. Moreover,
the LDA was used to search for the LDA space which increases the se-
parability between different classes as in Fig. 4. In the training stage,
LDA was applied on the training data, i.e., feature matrix, to find the
LDA space. In the testing stage, the feature vector of the unknown
Fig. 2. The proposed fish imaging data collection system (a camera and a
light source facing the object on a 360° rotating stand, with minimum and
maximum observation distances marked).
Table 1
Dataset description.
Class Number of samples Minimum dimensions Maximum dimensions
Class 1 60 (536 × 2260) (1108 × 3264)
Class 2 53 (752 × 2371) (1221 × 3264)
Class 3 58 (577 × 2500) (1317 × 3264)
Class 4 70 (893 × 2512) (1665 × 3264)
2 16 MP X-Trans CMOS II APS-C sensor, EXR Processor II, ISO 200–6400 (plus
100–51200 expanded), 2.36M-dot OLED electronic viewfinder with 0.62×
(equiv.) magnification, 3 in. 920k-dot tilting LCD.
sample was projected onto the LDA space to reduce its dimensionality.
It is worth mentioning that the standard LDA algorithm suffers from
the Small Sample Size (SSS) problem. This problem occurs when the
dimension of the data is much higher than the number of samples and
hence S_W becomes singular. Many algorithms have been proposed to
solve this problem, such as Regularized LDA (RLDA) (Lu et al., 2003), PCA + LDA
(Yu and Yang, 2001), and Direct LDA (DLDA) (Yu and Yang, 2001). In
our proposed model, due to the high dimensionality of WLD, the DLDA
algorithm was used for solving the SSS problem.
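The SSS problem is easy to see numerically: S_W is a sum of centered outer products, so its rank is at most N − c, and it must be singular whenever the dimensionality d exceeds N − c. The sizes below are illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, c = 20, 100, 4                       # d >> N: the SSS regime
X = rng.normal(size=(N, d))
y = np.repeat(np.arange(c), N // c)
# Within-class scatter S_W as in Section 3.3
Sw = sum((X[y == j] - X[y == j].mean(axis=0)).T @
         (X[y == j] - X[y == j].mean(axis=0)) for j in range(c))
rank = np.linalg.matrix_rank(Sw)
# rank is at most N - c = 16, far below d = 100, so Sw cannot be inverted
```

This is exactly why a variant such as DLDA, RLDA, or PCA + LDA is needed before the eigenproblem of Section 3.3 can be solved.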
4.4. Classification phase
During the classification phase, the proposed system gives a pre-
diction or class label for the unknown sample. In the proposed ap-
proach, the AdaBoost classifier was used for classification. In the
training stage, the AdaBoost classifier used the extracted features from
Fig. 4. Block diagram of the proposed approach.
Fig. 3. Samples from our collected dataset.
the training data to train the classification model. This model will be
used to match the features of the unknown sample with the features of
the training data. In our model, the weak learners of the AdaBoost
classifier were decision tree classifiers.
5. Experimental results
5.1. Experimental setup
The proposed fish classification model was evaluated using 241
color images collected from the four fish species mentioned in
Section 4.1.
All experiments were performed on a PC with the following spec-
ifications: Intel(R) Core(TM) i5-2400 CPU @ 3.10 GHz, 4 GB RAM,
Windows 7 operating system, and Matlab 7.10.0 (R2010a).
Different experiments were conducted to test the influence of the
patch size of WLD on the computational time and accuracy of the
proposed model. From these experiments, we found that increasing the
patch size decreases the length of the feature vectors, which conse-
quently decreases the computational time for classification. Moreover, the best
accuracy of the proposed model was achieved when the patch size was
set to 5 × 5. Fig. 5 shows the histogram of the WLD method using
different patch sizes.
Due to the variation in the images' dimensions, as mentioned in Section
4.1, applying WLD generates feature vectors with different lengths. In
our experiments, we therefore rescaled all images to the same size
(500 × 2000). Additionally, due to the high dimensionality of the WLD
features, which reached up to 98,968 features, the LDA algorithm was
used to reduce this high dimensionality and to extract more discrimi-
native features. Fig. 6 shows a scatterplot for the samples of the
four classes after using LDA to reduce the number of features to four
features. In this figure, the x-axis and y-axis represent the features of
WLD after applying LDA. This figure shows how the samples for the
four classes are easily separated and hence classified. Fig. 7 shows the
scatterplot for the color feature (six features) of all samples. As shown,
the separability between different classes in Fig. 6 is much higher than
in Fig. 7. This is because projecting WLD features onto the LDA space
increased the between-class variance and hence increased the separ-
ability between different classes. For this reason, the LDA was applied
on the color features. Fig. 8 shows the scatterplot for the color features
after applying LDA. As shown, the separability between classes
increased, and this may have a positive impact on the classification
accuracy.
5.2. Experimental scenarios
In this section, four experiments were conducted. The aim of the
first experiment (in Section 5.2.1) was to compare the proposed model
using WLD features, color features, and the WLD + color features with
and without the normalization step. The goal of the second experiment
(in Section 5.2.2) was to evaluate the proposed model, i.e., normalized
WLD + color features, using different numbers of single classifiers. The
third experiment (in Section 5.2.3) was conducted to compare the
AdaBoost classifier with three well-known classifiers. The fourth and
last experiment (Section 5.2.4) was conducted to show the influence of
the image rotation and image translation on the performance of the
proposed model.
In all experiments, k-fold cross-validation tests have been used,
where the original samples of the dataset were randomly partitioned
into k subsets of (approximately) equal size and the experiment was run
k times; for each run, one subset was used as a testing set and the other
k − 1 subsets were used as the training set. The average of the k results
from the folds can then be calculated to produce a single estimation. In
this study, the value of k was set to 10.
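The k-fold procedure described above can be sketched as follows; the classifier hook train_and_predict and the shuffling seed are hypothetical conveniences for illustration, not details from the paper.

```python
import numpy as np

def kfold_accuracy(X, y, train_and_predict, k=10, seed=0):
    """k-fold cross-validation: shuffle, split into k roughly equal folds,
    hold each fold out once, and average the per-fold accuracy.
    train_and_predict(X_tr, y_tr, X_te) must return predicted labels."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, k)
    accs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        pred = train_and_predict(X[train], y[train], X[test])
        accs.append(np.mean(pred == y[test]))
    return float(np.mean(accs)), float(np.std(accs))
```

The returned mean and standard deviation over the k folds correspond to the "accuracy ± deviation" entries reported in Tables 2 and 3.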
5.2.1. Color vs. WLD vs. WLD + color features
The aim of this experiment is to evaluate the proposed model using
different sets of features. This experiment consists of five sub-experi-
ments. In the first three sub-experiments, the color, color after applying
LDA (color + LDA), and WLD features were used in the proposed
model, respectively. In the fourth sub-experiment, the fusion of WLD
and color features after applying LDA was used. In the fifth sub-ex-
periment, the WLD and color features were first normalized using Zscore
normalization, and then the normalized features were combined. In all
sub-experiments, the AdaBoost classifier was used when the number of
single classifiers was 3, 7, 11, 15, and 19. Moreover, the patch size of
the WLD method was 5 × 5. Table 2 summarizes the results of this
experiment.
From Table 2 many conclusions can be drawn.
1. The proposed model achieved good results when the color features
were used. As shown, the accuracy rate ranged from 78.7% to
Fig. 5. WLD histogram using different patch sizes.
88.47%, and the accuracy increased when LDA was applied on the
color features. This reflects how LDA improved classification accu-
racy.
2. When using the WLD features, the classification accuracy was lower
than that achieved using the color features. This is because the color
features are more robust than WLD against image rotation, scaling,
and translation. Further, the WLD features are more affected by
image scaling than the color features, and the images in our
dataset are at different scales.
3. The proposed model using WLD and color features without nor-
malization achieved results lower than the color features. The
reason for the lower results is that the features of WLD and the color
Fig. 7. Scatter plot of the color features for the four classes in our experiments.
Fig. 6. Scatter plot of the WLD features after applying LDA for the four classes (C1 is the Argyrosomus regius class, C2 is the Sardinella maderensis class, C3 is the Scomberomorus commerson
class, and C4 is the Trachinotus ovatus class).
features are incompatible. Thus, the two sets of features need to be
normalized before combining them.
4. The proposed model using WLD and color features with normal-
ization achieved the best results, ranging from 90.4% to 96.35%
classification accuracy.
5. The accuracy of all methods increased with the number of single
classifiers. The maximum accuracy was achieved when the number
of single classifiers was set to 11 or 15.
In the next experiment, the influence of the number of single clas-
sifiers will be explained in more detail.
To conclude, combining the WLD and color features after normal-
izing them improved the performance of the proposed model and
yielded better results than the WLD or color features alone.
5.2.2. Number of single classifiers
The aim of this experiment is to evaluate the influence of the
number of single classifiers of the AdaBoost classifier on the proposed
model. In this experiment, the same sets of features that were utilized in
the first experiment (see Table 2) were used. The number of single
classifiers of AdaBoost was varied from 3 to 19. Tables 2 and 3 sum-
marize the results of this experiment using the testing and training data,
respectively.
Tables 2 and 3 reveal a number of interesting findings:
• Firstly, the higher the number of single classifiers combined in an
AdaBoost classifier, the better the classification accuracy achieved.
• Secondly, the accuracy using the training data was much higher
than the accuracy using the testing data, and this is because the
training data was used to train the classification model. Moreover,
the training accuracy of the proposed model increased until it
reached a maximum (as in our experiment) at which it remained
constant. Similarly, the testing accuracy increased when the en-
semble size was increased until it reached a point after
Table 2
The accuracy (in %) of the proposed model using color, color + LDA, WLD, WLD + color without normalization (A), WLD + color with normalization (B) (using the testing data).
# of single classifiers Color Color + LDA WLD WLD + Color (A) WLD + Color (B)
3 78.7 ± 6.6 86.5 ± 9.7 71.3 ± 10.5 80.3 ± 7.3 90.4 ± 8.0
7 86.2 ± 6.0 90.2 ± 7.2 73.7 ± 9.9 86.5 ± 7.5 92.5 ± 4.4
11 83.2 ± 5.6 93.6 ± 3.9 75.6 ± 8.8 87.4 ± 8.5 92.5 ± 4.8
15 87.2 ± 8.0 91.6 ± 4.5 76.8 ± 5.6 86.9 ± 8.8 96.4 ± 5.4
19 88.5 ± 9.5 90.4 ± 6.3 76.2 ± 9.8 87.2 ± 9.6 95.2 ± 3.8
Table 3
The results of the proposed model using color, color + LDA, WLD, WLD + color without normalization (A), WLD + color with normalization (B) (using the training data).
# of single classifiers Color Color + LDA WLD WLD + Color (A) WLD + Color (B)
3 98.2 ± 0.8 99.7 ± 0.4 77.2 ± 5.5 96.9 ± 1.6 99.9 ± 0.2
7 100.0 ± 0.0 100.0 ± 0.0 96.4 ± 1.9 100.0 ± 0.0 100.0 ± 0.0
11 100.0 ± 0.0 100.0 ± 0.0 98.5 ± 0.5 100.0 ± 0.0 100.0 ± 0.0
15 100.0 ± 0.0 100.0 ± 0.0 99.4 ± 0.3 100.0 ± 0.0 100.0 ± 0.0
19 100.0 ± 0.0 100.0 ± 0.0 100.0 ± 0.0 100.0 ± 0.0 100.0 ± 0.0
Fig. 8. Scatter plot of the color features after applying LDA for the four classes in our experiments.
A. Tharwat et al. Fisheries Research 204 (2018) 324–336
which it decreased again. This is because increasing the number of
single classifiers increases the complexity of the AdaBoost
classifier, which may lead to overfitting.
Fig. 9 presents the computational time of the proposed model using
different numbers of single classifiers. As shown in this figure, in-
creasing the number of single classifiers in the AdaBoost classifier in-
creases the computational time.
To conclude, increasing the number of single classifiers requires
more computational time and increases the training accuracy, but it
may lead to overfitting. The best accuracy was achieved when 15 single
classifiers were used.
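The ensemble-size experiment can be sketched as follows. This is an illustrative reconstruction using scikit-learn's AdaBoostClassifier on synthetic four-class data (the fish feature vectors themselves are not reproduced here); only the tested ensemble sizes are taken from the experiment above.

```python
# Sketch of the ensemble-size experiment: vary the number of single
# (weak) classifiers in AdaBoost and record the test accuracy.
# The data here is synthetic, standing in for the WLD + color features.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split


def ensemble_size_scores(sizes=(3, 7, 11, 15, 19), seed=0):
    # Stand-in data: four classes, as with the four fish species.
    X, y = make_classification(n_samples=400, n_features=20, n_classes=4,
                               n_informative=10, random_state=seed)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                              random_state=seed)
    scores = {}
    for n in sizes:  # ensemble sizes tested in the experiment above
        clf = AdaBoostClassifier(n_estimators=n, random_state=seed)
        clf.fit(X_tr, y_tr)
        scores[n] = clf.score(X_te, y_te)  # accuracy on held-out data
    return scores
```

With real features one would additionally average over repeated random splits, as the standard deviations in Tables 2 and 3 suggest was done.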
5.2.3. AdaBoost vs. conventional classifiers
The aim of this experiment is to test the proposed model using
different classifiers. In this experiment, the AdaBoost classifier was
compared with Naive Bayesian (NB) (Bender et al., 2004), k-Nearest
Neighbor (k-NN) (Tharwat et al., 2013), and Multi-Layer Perceptron
(MLP) (Gardner and Dorling, 1998) classifiers. In the AdaBoost classi-
fier, 15 single classifiers were used, and the value of k in k-NN was set
to three. In the MLP, one hidden layer with 15 nodes was used, and the
network was trained for 1000 epochs. In this experiment, the WLD +
color features with normalization were used. Fig. 10 displays the
results of this experiment.
As shown in Fig. 10, the AdaBoost classifier obtained the best results
compared to the other classifiers. This is because AdaBoost combines
many classifiers rather than relying on a single one, as k-NN, MLP, and
NB do. Hence, the class labels predicted by AdaBoost are generated by
combining the decisions of different weak learners, which gives it an
advantage over the single classifiers. Also, AdaBoost focuses on critical
points or samples, as mentioned in Section 3.4, which increases the
classification performance. In addition, the MLP classifier achieved the
second-best results, while k-NN achieved the worst.
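The comparison can be reproduced in outline with scikit-learn, using the hyperparameters reported above (15 AdaBoost learners, k = 3, one hidden layer of 15 nodes trained for 1000 epochs). The data below is synthetic, not the normalized WLD + color features, so this is a sketch of the protocol rather than the paper's implementation.

```python
# Sketch of the classifier comparison: AdaBoost vs. NB, k-NN, and MLP,
# each evaluated by 5-fold cross-validated accuracy on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score


def compare_classifiers(seed=0):
    X, y = make_classification(n_samples=300, n_features=15, n_classes=4,
                               n_informative=8, random_state=seed)
    models = {
        'AdaBoost': AdaBoostClassifier(n_estimators=15, random_state=seed),
        'NB': GaussianNB(),
        'k-NN': KNeighborsClassifier(n_neighbors=3),
        'MLP': MLPClassifier(hidden_layer_sizes=(15,), max_iter=1000,
                             random_state=seed),
    }
    # Mean accuracy over five cross-validation folds for each classifier.
    return {name: cross_val_score(m, X, y, cv=5).mean()
            for name, m in models.items()}
```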
For further evaluation, a comparison between our approach and the
most closely related work was conducted, as illustrated in Table 4. As
shown in the table, there is a clear difference between the numbers of
samples in the datasets, which ranged from 22 to 1346 samples, and
the numbers of classes also differ. Hence, it is not fair to compare our
results directly with the related work listed in Table 4. However, it can
be remarked that the proposed approach
Fig. 9. Computational time of the proposed model using different numbers of single classifiers.
Fig. 10. Accuracy of the proposed model using AdaBoost, NB, k-NN, and MLP classifiers.
Table 4
A comparison between our proposed fish identification model and some state-of-the-art models in terms of accuracy, feature extraction methods, and classification algorithms.
Reference Feature extraction method Classification method Results Dataset
Cadieux et al. (2000) Moment invariants + Fourier descriptors + nine shape features NN 70.8–72.7% 1200 images (5 classes)
Chambah et al. (2003) Geometric + color + motion + texture features Quadratic Bayes 85.77% 1346 images (12 classes)
Lee et al. (2004) Template matching Shape matching 60% 22 images (9 classes)
Rova et al. (2007) Deformable template matching SVM 90% 320 images (2 classes)
Proposed approach WLD + color AdaBoost 96.4% 241 images (4 classes)
yielded significant results.
To further test our proposed model, the Receiver Operating
Characteristic (ROC) curves and the confusion matrix are given in
Figs. 11 and 12, respectively. As shown in Fig. 11, all four classes
achieved high results, which reflects the robustness of the proposed
model. These results are in agreement with the results in Fig. 12,
where the four classes achieved 96.3% accuracy and the error rate was
small. Moreover, the sensitivity, or true positive rate, of the four
classes (the bottom row) ranged from 95.7 to 96.7%, which reflects the
robustness of our proposed model. Additionally, the right column shows
the precision, or positive predictive value, which was more than 93%
for all classes.
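The per-class quantities read off Fig. 12 follow directly from the confusion matrix: sensitivity divides each diagonal entry by its row sum, precision by its column sum. The sketch below shows this computation with scikit-learn on illustrative labels, not the paper's actual predictions.

```python
# Derive per-class sensitivity (recall) and precision from a confusion
# matrix, mirroring the bottom row and right column of Fig. 12.
import numpy as np
from sklearn.metrics import confusion_matrix


def per_class_metrics(y_true, y_pred):
    cm = confusion_matrix(y_true, y_pred)  # rows: true class, cols: predicted
    tp = np.diag(cm).astype(float)         # correctly classified per class
    sensitivity = tp / cm.sum(axis=1)      # TP / (TP + FN), per class
    precision = tp / cm.sum(axis=0)        # TP / (TP + FP), per class
    return cm, sensitivity, precision
```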
5.2.4. Proposed model vs. image challenges
In this section, the proposed model was tested against different
image transformations such as rotation and translation. In this experi-
ment, the number of single classifiers was 11 and 15. The details of this
experiment are as follows:
• Image rotation: In this sub-experiment, the influence of image ro-
tation on the proposed model was tested. To do that, 50% of the
images were randomly rotated by different angles between 5° and
30°. Samples of the images after rotation are illustrated in Fig. 13.
The results of this sub-experiment are reported in Fig. 15.
• Image translation: In this sub-experiment, the images were shifted
or translated by a specific number of pixels in both the x and y
directions. The translation in this sub-experiment was 10% of the
original image size. Fig. 14 displays samples of the translated
images. As shown, the effect of translation is similar to occlusion,
which is one of the common problems in biometrics. The results of
this sub-experiment are shown in Fig. 15.
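The two transformations described above can be sketched with scipy.ndimage; the angle range and 10% shift follow the experiment, while the interpolation settings are assumptions for illustration.

```python
# Sketch of the two image challenges: rotation (5°-30°) and translation
# (10% of the image size), as applied to the test images.
import numpy as np
from scipy import ndimage


def rotate_image(img, angle_deg):
    # Rotate about the center while keeping the original shape; the
    # uncovered corners are filled with zeros (black), as in Fig. 13.
    return ndimage.rotate(img, angle_deg, reshape=False, order=1,
                          mode='constant', cval=0.0)


def translate_image(img, frac=0.10):
    # Shift by a fraction of the image size in both directions; pixels
    # pushed out of the frame are lost, which mimics partial occlusion.
    dy = int(round(img.shape[0] * frac))
    dx = int(round(img.shape[1] * frac))
    return ndimage.shift(img, (dy, dx), order=1, mode='constant', cval=0.0)
```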
Fig. 15 shows the results of the proposed model when the images
were rotated or translated. From the figure the following remarks can
be drawn:
1. The performance of the proposed model was slightly affected by the
image rotation or translation. This is because the proposed model
depends on the color and WLD features, and the color features are
robust against rotation and translation. Moreover, the orientation
component of the WLD algorithm makes the WLD robust against
rotation as reported in Chen et al. (2010).
Fig. 12. Confusion matrix indicating the number of correctly classified samples for each
fish species and the misclassification errors made by the proposed model.
Fig. 13. Samples of the images after rotation.
Fig. 11. ROC of the proposed model.
Fig. 14. Samples of the images after translation.
2. Image translation had a stronger negative influence on the results of
the proposed model than image rotation. This is because image
translation removes or neglects some parts of the image, which may
contain discriminative information.
However, in both the rotation and translation cases, the accuracy of
the proposed model remained higher than 86%, which is a competitive
result given the uncontrolled environments in which animals are
identified in real-time scenarios.
6. Conclusion and future work
In this paper, a new approach for fish identification was proposed.
This approach used the fusion of Weber's Local Descriptor (WLD) fea-
tures and color features. It also used the LDA algorithm to reduce the
dimensions of feature vectors and to increase the discrimination be-
tween different classes (fish species). The AdaBoost classifier was used
for classification. The experimental results showed that the proposed
model is robust when the WLD and color features were normalized and
combined. Moreover, different numbers of single classifiers were
tested, and the best accuracy was achieved when the number of single
classifiers was 15. The AdaBoost classifier was compared with several
well-known classifiers and achieved the best results. Furthermore, the
proposed model yielded competitive results when the images were
rotated or translated, which reflects its robustness. In future work, our
approach will be evaluated on a larger database of fish images.
Moreover, our model will be extended to estimate fish size, weight, and
age, which are important for stock assessment and management.
Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict
of interest.
Data availability The datasets created during and/or analyzed
during the current study are available from the corresponding author
upon reasonable request.
Human and animal rights All applicable international, national,
and/or institutional guidelines for the care and use of animals were
followed.
References
Ahonen, T., Hadid, A., Pietikainen, M., 2006. Face description with local binary patterns:
application to face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 28 (12),
2037–2041.
Baudat, G., Anouar, F., 2000. Generalized discriminant analysis using a kernel approach.
J. Neural Comput. 12 (10), 2385–2404.
Bender, A., Mussa, H.Y., Glen, R.C., Reiling, S., 2004. Molecular similarity searching using
atom environments, information-based feature selection, and a naive Bayesian clas-
sifier. J. Chem. Inf. Comput. Sci. 44 (1), 170–178.
Benson, B., Cho, J., Goshorn, D., Kastner, R., 2009. Field Programmable Gate Array
(FPGA) Based Fish Detection Using Haar Classifiers. American Academy of
Underwater Sciences.
Cadieux, S., Michaud, F., Lalonde, F., 2000. Intelligent system for automated fish sorting
and counting. In: Proceedings of International Conference on Intelligent Robots and
Systems (IROS 2000), vol. 2. IEEE. pp. 1279–1284.
Chambah, M., Semani, D., Renouf, A., Courtellemont, P., Rizzi, A., 2003. Underwater
color constancy: enhancement of automatic live fish recognition. In: Electronic
Imaging 2004. International Society for Optics and Photonics. pp. 157–168.
Chen, J., Shan, S., He, C., Zhao, G., Pietikainen, M., Chen, X., Gao, W., 2010. WLD: a
robust local image descriptor. IEEE Trans. Pattern Anal. Mach. Intell. 32 (9),
1705–1720.
Chen, J., Shan, S., Zhao, G., Chen, X., Gao, W., Pietikainen, M., 2008. A robust descriptor
based on Weber's law. In: IEEE Conference on Computer Vision and Pattern
Recognition, CVPR 2008. IEEE. pp. 1–7.
European Commission, 1986. Regulation 3703/85, Common Fisheries Policy, Community
Grading Rules, A Guidance Note to the Fishing Industry. Issued by the UK Fisheries
Departments.
Corkery, G., Gonzales-Barron, U.A., Butler, F., McDonnell, K., Ward, S., 2007. A pre-
liminary investigation on face recognition as a biometric identifier of sheep. Trans.
ASABE 50 (1), 313–320.
Gaber, T., Tharwat, A., Hassanien, A.E., Snasel, V., 2016. Biometric cattle identification
approach based on Weber's local descriptor and AdaBoost classifier. Comput.
Electron. Agric. 122, 55–66.
Gardner, M.W., Dorling, S., 1998. Artificial neural networks (the multilayer perceptron) –
a review of applications in the atmospheric sciences. Atmos. Environ. 32 (14),
2627–2636.
Gong, D., Li, S., Xiang, Y., 2011. Face recognition using the Weber local descriptor. In:
First Asian Conference on Pattern Recognition (ACPR). IEEE. pp. 589–592.
Gonzales Barron, U., Corkery, G., Barry, B., Butler, F., McDonnell, K., Ward, S., 2008.
Assessment of retinal recognition technology as a biometric identification. J. Comput.
Electron. Agric. 60 (2), 156–166.
Hasija, S., Buragohain, M.J., Indu, S., 2017. Fish species classification using graph em-
bedding discriminant analysis. In: International Conference on Machine Vision and
Information Technology (CMVIT). IEEE. pp. 81–86.
Hnin, T.T., Lynn, K.T., 2016. Fish classification based on robust features selection using
machine learning techniques. Genetic and Evolutionary Computing. Springer, pp.
237–245.
Hyvärinen, A., Oja, E., 2000. Independent component analysis: algorithms and applica-
tions. Neural Netw. 13 (4), 411–430.
Iscsmen, B., Kutlu, Y., Reyhaniye, A.N., Turan, C., 2014. Image analysis methods on fish
recognition. In: 2014 22nd Signal Processing and Communications Applications
Conference (SIU). IEEE. pp. 1411–1414.
Jain, A., Nandakumar, K., Ross, A., 2005. Score normalization in multimodal biometric
systems. Pattern Recognit. 38 (12), 2270–2285.
Jiménez-Gamero, I., Dorado, G., Muñoz-Serrano, A., Analla, M., Alonso-Moraga, A., 2006.
DNA microsatellites to ascertain pedigree-recorded information in a selecting nucleus
of Murciano-Granadina dairy goats. Small Rumin. Res. 65 (3), 266–273.
Kuncheva, L.I., 2014. Combining Pattern Classifiers: Methods and Algorithms, 2nd ed.
John Wiley & Sons.
Larsen, R., Olafsdottir, H., Ersbøll, B., 2009. Shape and texture based classification of fish
species. Image Anal. 745–749.
Lee, D.-J., Schoenberger, R.B., Shiozawa, D., Xu, X., Zhan, P., 2004. Contour matching for
a fish recognition and migration-monitoring system. Optics East. International
Society for Optics and Photonics, pp. 37–48.
Lowe, D.G., 1999. Object recognition from local scale-invariant features. In: The
Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 2.
IEEE. pp. 1150–1157.
Lu, J., Plataniotis, K.N., Venetsanopoulos, A.N., 2003. Regularized discriminant analysis
for the small sample size problem in face recognition. Pattern Recognit. Lett. 24 (16),
3079–3087.
Ohta, Y.-I., Kanade, T., Sakai, T., 1980. Color information for region segmentation.
Comput. Graph. Image Process. 13 (3), 222–241.
Peirce, J., Leigh, A., Kendrick, K., et al., 2001. Human face recognition in sheep: lack of
configurational coding and right hemisphere advantage. Behav. Process. 55 (1),
13–26.
Rova, A., Mori, G., Dill, L.M., 2007. One fish, two fish, butterfish, trumpeter: recognizing
fish in underwater video. Proceedings of the IAPR Conference on Machine Vision
Applications 404–407.
Rusk, C.P., Blomeke, C.R., Balschweid, M.A., Elliot, S., Baker, D., 2006. An evaluation of
retinal imaging technology for 4-h beef and sheep identification. J. Ext. 44 (5), 1–33.
Shafait, F., Mian, A., Shortis, M., Ghanem, B., Culverhouse, P.F., Edgington, D., Cline, D.,
Ravanbakhsh, M., Seager, J., Harvey, E.S., 2016. Fish identification from videos
captured in uncontrolled underwater environments. ICES J. Mar. Sci. Journal du
Conseil 73 (10), 2737–2746.
Spampinato, C., Giordano, D., Di Salvo, R., Chen-Burger, Y.-H.J., Fisher, R.B., Nadarajan,
G., 2010. Automatic fish classification for underwater species behavior under-
standing. In: Proceedings of the First ACM International Workshop on Analysis and
Retrieval of Tracked Events and Motion in Imagery Streams. ACM. pp. 45–50.
Tharwat, A., 2016a. Linear vs. quadratic discriminant analysis classifier: a tutorial. Int. J.
Appl. Pattern Recognit. 3 (2), 145–180.
Tharwat, A., 2016b. Principal component analysis – a tutorial. Int. J. Appl. Pattern
Recognit. 3 (3), 197–240.
Fig. 15. Performance of the proposed model using the original images, rotated images,
and translated images (the results are rounded to the nearest integer).
Tharwat, A., Gaber, T., Hassanien, A.E., Schaefer, G., Pan, J.-S., 2016a. A fully-automated
zebra animal identification approach based on sift features. In: International
Conference on Genetic and Evolutionary Computing. Springer. pp. 289–297.
Tharwat, A., Mahdi, H., Hassanien, A.E., 2016b. Plant recommender system based on
multi-label classification. In: International Conference on Advanced Intelligent
Systems and Informatics. Springer. pp. 825–835.
Tharwat, A., Gaber, T., Ibrahim, A., Hassanien, A.E., 2017. Linear discriminant analysis: a
detailed tutorial. AI Commun. 1–22 (Preprint).
Tharwat, A., Ghanem, A.M., Hassanien, A.E., 2013. Three different classifiers for facial
age estimation based on k-nearest neighbor. In: Proceedings of the 9th International
Computer Engineering Conference (ICENCO). IEEE. pp. 55–60.
Yu, H., Yang, J., 2001. A direct LDA algorithm for high-dimensional data-with application
to face recognition. Pattern Recognit. 34 (10), 2067–2070.