This document summarizes research on automatically classifying frog calls using wireless sensor networks and machine learning techniques. The researchers extracted features like MFCCs and wavelet coefficients from frog vocalizations and used k-NN and genetic algorithms to select an optimal feature subset and classify four frog species. Their results showed MFCCs achieved higher classification accuracy compared to wavelet features and that 8 MFCCs provided an optimized tradeoff between performance and computational cost for use on wireless sensor nodes.
Feature Subset Selection for Automatically Classifying Anuran Calls Using Sensor Networks
1. Juan G. Colonna
Afonso D. Ribas
Eduardo F. Nakamura
Eulanda M. dos Santos
Institute of Computing (IComp)
Federal University of Amazonas (UFAM)
2. Introduction - Environmental Motivation
The study of environmental conditions helps to:
maintain quality of life, and
preserve species.
The loss of species is an irreversible process!
Monitoring variation in species populations makes it possible to:
identify environmental problems in their early stages, and
establish strategies for conserving biological diversity.
3. Introduction - Environmental Motivation
Variations in amphibian populations are related to pollution,
deforestation, urbanization, etc.
Frogs can be used as indicators for detecting environmental stress.
Figure: Percentage of threatened species in the Red List. Figure adapted from [Stuart et al., 2004].
4. Introduction - Objectives
Classify frog species of tropical forests from their vocalizations,
using wireless sensor networks and machine learning techniques. *
* Consideration: hardware restrictions of the sensor nodes.
5. Introduction - Challenges
Develop a method that needs no human intervention.
Characterize the frequency spectrum of frog calls.
Extract and select the optimal set of features.
Define the classification technique.
Find the minimum set of features using a genetic algorithm.
Determine the processing cost of each feature.
Correlate the processing cost with the success rate.
Maximize the benefit-cost ratio.
(Diagram label: WSN and Machine Learning)
6. Related Work
Author                            Animal        Features     Classifier  Results      WSN
Taylor et al. [1996]              Bufo marinus  Spectrogram  C4.5        60%          No
Hu et al. [2005]                  Bufo marinus  Spectrogram  C4.5        60%          Yes
Yen & Fu [2002]*                  4 frog        Wavelet      MLP         71%          No
                                                Fisher's
Clemins [2005]                    elephant      MFCCs        HMM         69%          No
                                                PLP          DTW         73%
Cai et al. [2007]                 14 bird       MFCCs        ANN         81% - 86%    Yes
Huang et al. [2009]*              5 frog        S - B - ZC   k-NN        83% - 100%   No
                                                             SVM         82% - 100%
Vaca-Castaño & Rodriguez [2010]*  10 bird       MFCCs        k-NN        86%          Yes
                                  20 frog       PCA                      91%
Han et al. [2011]*                9 frog        S - Hs - Hr  k-NN        83% - 100%   No
* Work implemented and used in the comparisons.
10. Spectrogram
Figure: Audio sample (waveform and spectrogram) of Adenomera andreae.
11. Features
Feature    Complexity order  Computational cost
Pitch      O(L)              3L − 1
B          O(N log(N))       2M + 2M + N log(N)
12 MFCCs   O(N log(N))       N log(N) + N + mR
S          O(N log(N))       2M + N log(N)
H1         O(L)              L + i
H2         O(L)              L + i
ZC         O(L)              L
E          O(L)              L
Pw         O(L)              L
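To compare the cost expressions in the table numerically, they can be evaluated for concrete sizes. The sketch below is only illustrative: the slides do not define the symbols L, N, M, m, R, and i, so every value passed in is a hypothetical placeholder, and the logarithm is assumed to be base 2.

```python
import math

def feature_costs(L, N, M, m, R, i):
    """Evaluate the table's cost expressions for the given sizes.

    The meaning of each symbol is not stated in the slides; the arguments
    are placeholders, and log is assumed to be base 2.
    """
    nlogn = N * math.log2(N)
    return {
        "Pitch": 3 * L - 1,
        "B": 2 * M + 2 * M + nlogn,
        "12 MFCCs": nlogn + N + m * R,
        "S": 2 * M + nlogn,
        "H1": L + i,
        "H2": L + i,
        "ZC": L,
        "E": L,
        "Pw": L,
    }

# Hypothetical sizes, chosen only to show the relative magnitudes.
costs = feature_costs(L=512, N=512, M=256, m=12, R=23, i=10)
for name, cost in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{name:9s} {cost:8.0f}")
```

With sizes like these, the O(L) features cost far less than the FFT-based ones, which is the kind of gap the benefit-cost analysis has to weigh against classification accuracy.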
12. Comparison between MFCCs and Wavelet
Features                                 k-NN success rate
                                         alpha = 0.4  alpha = 0.5  alpha = 0.6
Wavelet features (Daubechies transform)  96.35%(3)    97.86%(1)    98.22%(1)
Wavelet features (Haar transform)        96.70%(1)    97.90%(1)    98.38%(1)
MFCCs                                    99.19%(9)    99.36%(2)    99.19%(1)
Table: Success rate as a function of alpha, using 10-fold cross-validation.
Applying the Wilcoxon test at a 95% confidence level (α = 0.05), we conclude that the
MFCCs perform better.
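The evaluation protocol named in the table caption, k-NN scored with 10-fold cross-validation, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the data is synthetic (four well-separated clusters standing in for four species), Euclidean distance is an assumption, and the helper names `knn_predict` and `cross_val_accuracy` are hypothetical.

```python
import random
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k training samples nearest to `query`."""
    nearest = sorted(
        train,
        key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def cross_val_accuracy(samples, k=1, folds=10, seed=0):
    """Fraction of samples classified correctly when each fold is held out once."""
    data = samples[:]
    random.Random(seed).shuffle(data)
    correct = 0
    for f in range(folds):
        test = data[f::folds]  # every folds-th sample forms the test fold
        train = [s for idx, s in enumerate(data) if idx % folds != f]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(data)

# Synthetic stand-in for feature vectors of four species (not real data).
rng = random.Random(42)
samples = [
    ([center + rng.gauss(0, 0.5) for _ in range(3)], species)
    for species, center in enumerate([0.0, 5.0, 10.0, 15.0])
    for _ in range(30)
]
accuracy = cross_val_accuracy(samples, k=1, folds=10)
print(f"10-fold accuracy: {accuracy:.2%}")
```

Running the same routine on each feature set yields per-fold results that a paired test such as Wilcoxon's can then compare.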
13. Comparison between MFCCs and Wavelet
Objective: Determine the optimal subset of features by applying a genetic algorithm (GA).
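A GA for feature subset selection typically encodes each candidate subset as a bit mask and scores it with the classifier. The sketch below shows one way this could look. The slides give crossover and mutation percentages but not the fitness function, selection scheme, population size, or how the rates were applied, so the choices here are all assumptions: leave-one-out 1-NN accuracy minus a small per-feature penalty, tournament selection, one-point crossover, single-bit mutation, and elitism. The data is synthetic, and the names `loo_accuracy`, `fitness`, and `select_features` are hypothetical.

```python
import random

def loo_accuracy(samples, mask):
    """Leave-one-out 1-NN accuracy using only features where mask[i] == 1."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0
    hits = 0
    for a, (xa, ya) in enumerate(samples):
        nearest = min(
            (s for b, s in enumerate(samples) if b != a),
            key=lambda s: sum((xa[i] - s[0][i]) ** 2 for i in idx),
        )
        hits += nearest[1] == ya
    return hits / len(samples)

def fitness(samples, mask, penalty=0.01):
    """Assumed fitness: accuracy minus a small cost per selected feature."""
    return loo_accuracy(samples, mask) - penalty * sum(mask)

def select_features(samples, n_features, pop_size=12, generations=15,
                    p_cross=0.6, p_mut=0.2, penalty=0.01, seed=0):
    rng = random.Random(seed)
    cache = {}

    def score(mask):
        key = tuple(mask)
        if key not in cache:
            cache[key] = fitness(samples, mask, penalty)
        return cache[key]

    # Start from the full feature set plus random masks.
    pop = [[1] * n_features]
    pop += [[rng.randint(0, 1) for _ in range(n_features)]
            for _ in range(pop_size - 1)]
    best = max(pop, key=score)
    for _ in range(generations):
        nxt = [best[:]]  # elitism: the best mask always survives
        while len(nxt) < pop_size:
            p1 = max(rng.sample(pop, 2), key=score)  # tournament selection
            p2 = max(rng.sample(pop, 2), key=score)
            child = p1[:]
            if rng.random() < p_cross:  # one-point crossover
                cut = rng.randrange(1, n_features)
                child = p1[:cut] + p2[cut:]
            if rng.random() < p_mut:  # flip one randomly chosen bit
                child[rng.randrange(n_features)] ^= 1
            nxt.append(child)
        pop = nxt
        best = max(pop, key=score)
    return best

# Synthetic data: features 0 and 1 carry the class, the other four are noise.
rng = random.Random(1)
samples = []
for label in range(4):
    for _ in range(10):
        informative = [3 * label + rng.gauss(0, 0.3), -3 * label + rng.gauss(0, 0.3)]
        noise = [rng.gauss(0, 3) for _ in range(4)]
        samples.append((informative + noise, label))

best = select_features(samples, n_features=6)
print("selected features:", [i + 1 for i, bit in enumerate(best) if bit])
```

Because the full feature set is in the initial population and the best mask is carried over each generation, the returned subset never scores worse than using all features under this fitness.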
14. Comparison between MFCCs and Wavelet
Features              Classification  Crossover 50%, Mutation 40%      Crossover 60%, Mutation 20%
                      before GA       Selected features  Success rate  Selected features        Success rate
9 features with Db    97.86%(1)       1,2,3,5            93.73%        1,2,3,4,5,6,8,9          96.83%
9 features with Haar  97.90%(1)*      2,3,4,5,6,8,9      96.47%        1,2,3,4,5,6,7,8,9        97.90%*
12 MFCCs              99.36%(2)*      1,2,3,4,5,6,7,11   99.08%        1,2,3,4,5,6,7,8,9,11,12  99.33%*
16. Conclusions
We identified the 12 MFCCs as the best feature set.
Costs can be reduced by using 8 MFCCs, although the method loses
generality.
The MFCCs have:
✔ Better success rate;
✔ Constant cost, regardless of hardware; and
✔ Immunity to environmental and quantization noise.