This document summarizes kernel-based speaker recognition techniques for an automatic speaker recognition system (ASR) in iTaukei cross-language speech recognition. It discusses kernel principal component analysis (KPCA), kernel independent component analysis (KICA), and kernel linear discriminant analysis (KLDA) for nonlinear speaker-specific feature extraction to improve ASR classification rates. Evaluation of the ASR system using these techniques on a Japanese language corpus and self-recorded iTaukei corpus showed that KLDA achieved the best performance, with an equal error rate improvement of up to 8.51% compared to KPCA and KICA.
The aim of this research is to find accurate solution for the Troesch’s problem by using high performance technique based on parallel processing implementation.
Design/methodology/approach – Feed forward neural network is designed to solve important type of differential equations that arises in many applied sciences and engineering applications. The suitable designed based on choosing suitable learning rate, transfer function, and training algorithm. The authors used back propagation with new implement of Levenberg - Marquardt training algorithm. Also, the authors depend new idea for choosing the weights. The effectiveness of the suggested design for the network is shown by using it for solving Troesch problem in many cases.
Findings – New idea for choosing the weights of the neural network, new implement of Levenberg - Marquardt training algorithm which assist to speeding the convergence and the implementation of the suggested design demonstrates the usefulness in finding exact solutions.
A PSO-Based Subtractive Data Clustering AlgorithmIJORCS
There is a tremendous proliferation in the amount of information available on the largest shared information source, the World Wide Web. Fast and high-quality clustering algorithms play an important role in helping users to effectively navigate, summarize, and organize the information. Recent studies have shown that partitional clustering algorithms such as the k-means algorithm are the most popular algorithms for clustering large datasets. The major problem with partitional clustering algorithms is that they are sensitive to the selection of the initial partitions and are prone to premature converge to local optima. Subtractive clustering is a fast, one-pass algorithm for estimating the number of clusters and cluster centers for any given set of data. The cluster estimates can be used to initialize iterative optimization-based clustering methods and model identification methods. In this paper, we present a hybrid Particle Swarm Optimization, Subtractive + (PSO) clustering algorithm that performs fast clustering. For comparison purpose, we applied the Subtractive + (PSO) clustering algorithm, PSO, and the Subtractive clustering algorithms on three different datasets. The results illustrate that the Subtractive + (PSO) clustering algorithm can generate the most compact clustering results as compared to other algorithms.
JAVA BASED VISUALIZATION AND ANIMATION FOR TEACHING THE DIJKSTRA SHORTEST PAT...ijseajournal
Shortest path (SP) algorithms, such as the popular Dijkstra algorithm has been considered as the “basic
building blocks” for many advanced transportation network models. Dijkstra algorithm will find the
shortest time (ST) and the corresponding SP to travel from a source node to a destination node.
Applications of SP algorithms include real-time GPS and the Frank-Wolfe network equilibrium.
For transportation engineering students, the Dijkstra algorithm is not easily understood. This paper
discusses the design and development of a software that will help the students to fully understand the key
components involved in the Dijkstra SP algorithm. The software presents an intuitive interface for
generating transportation network nodes/links, and how the SP can be updated in each iteration. The
software provides multiple visual representations of colour mapping and tabular display. The software can
be executed in each single step or in continuous run, making it easy for students to understand the Dijkstra
algorithm. Voice narratives in different languages (English, Chinese and Spanish) are available.A demo
video of the Dijkstra Algorithm’s animation and result can be viewed online from any web browser using
the website: http://www.lions.odu.edu/~imako001/dijkstra/demo/index.html.
GENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXTcscpconf
This article presents Part of Speech tagging for Nepali text using General Regression Neural
Network (GRNN). The corpus is divided into two parts viz. training and testing. The network is
trained and validated on both training and testing data. It is observed that 96.13% words are
correctly being tagged on training set whereas 74.38% words are tagged correctly on testing
data set using GRNN. The result is compared with the traditional Viterbi algorithm based on
Hidden Markov Model. Viterbi algorithm yields 97.2% and 40% classification accuracies on
training and testing data sets respectively. GRNN based POS Tagger is more consistent than the
traditional Viterbi decoding technique.
The aim of this research is to find accurate solution for the Troesch’s problem by using high performance technique based on parallel processing implementation.
Design/methodology/approach – Feed forward neural network is designed to solve important type of differential equations that arises in many applied sciences and engineering applications. The suitable designed based on choosing suitable learning rate, transfer function, and training algorithm. The authors used back propagation with new implement of Levenberg - Marquardt training algorithm. Also, the authors depend new idea for choosing the weights. The effectiveness of the suggested design for the network is shown by using it for solving Troesch problem in many cases.
Findings – New idea for choosing the weights of the neural network, new implement of Levenberg - Marquardt training algorithm which assist to speeding the convergence and the implementation of the suggested design demonstrates the usefulness in finding exact solutions.
A PSO-Based Subtractive Data Clustering AlgorithmIJORCS
There is a tremendous proliferation in the amount of information available on the largest shared information source, the World Wide Web. Fast and high-quality clustering algorithms play an important role in helping users to effectively navigate, summarize, and organize the information. Recent studies have shown that partitional clustering algorithms such as the k-means algorithm are the most popular algorithms for clustering large datasets. The major problem with partitional clustering algorithms is that they are sensitive to the selection of the initial partitions and are prone to premature converge to local optima. Subtractive clustering is a fast, one-pass algorithm for estimating the number of clusters and cluster centers for any given set of data. The cluster estimates can be used to initialize iterative optimization-based clustering methods and model identification methods. In this paper, we present a hybrid Particle Swarm Optimization, Subtractive + (PSO) clustering algorithm that performs fast clustering. For comparison purpose, we applied the Subtractive + (PSO) clustering algorithm, PSO, and the Subtractive clustering algorithms on three different datasets. The results illustrate that the Subtractive + (PSO) clustering algorithm can generate the most compact clustering results as compared to other algorithms.
JAVA BASED VISUALIZATION AND ANIMATION FOR TEACHING THE DIJKSTRA SHORTEST PAT...ijseajournal
Shortest path (SP) algorithms, such as the popular Dijkstra algorithm has been considered as the “basic
building blocks” for many advanced transportation network models. Dijkstra algorithm will find the
shortest time (ST) and the corresponding SP to travel from a source node to a destination node.
Applications of SP algorithms include real-time GPS and the Frank-Wolfe network equilibrium.
For transportation engineering students, the Dijkstra algorithm is not easily understood. This paper
discusses the design and development of a software that will help the students to fully understand the key
components involved in the Dijkstra SP algorithm. The software presents an intuitive interface for
generating transportation network nodes/links, and how the SP can be updated in each iteration. The
software provides multiple visual representations of colour mapping and tabular display. The software can
be executed in each single step or in continuous run, making it easy for students to understand the Dijkstra
algorithm. Voice narratives in different languages (English, Chinese and Spanish) are available.A demo
video of the Dijkstra Algorithm’s animation and result can be viewed online from any web browser using
the website: http://www.lions.odu.edu/~imako001/dijkstra/demo/index.html.
GENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXTcscpconf
This article presents Part of Speech tagging for Nepali text using General Regression Neural
Network (GRNN). The corpus is divided into two parts viz. training and testing. The network is
trained and validated on both training and testing data. It is observed that 96.13% words are
correctly being tagged on training set whereas 74.38% words are tagged correctly on testing
data set using GRNN. The result is compared with the traditional Viterbi algorithm based on
Hidden Markov Model. Viterbi algorithm yields 97.2% and 40% classification accuracies on
training and testing data sets respectively. GRNN based POS Tagger is more consistent than the
traditional Viterbi decoding technique.
Single Channel Speech De-noising Using Kernel Independent Component Analysis...theijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
Supervised learning: discover patterns in the data that relate data attributes with a target (class) attribute.
These patterns are then utilized to predict the values of the target attribute in future data instances.
Unsupervised learning: The data have no target attribute.
We want to explore the data to find some intrinsic structures in them.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
K-MEDOIDS CLUSTERING USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...ijscmc
Face recognition is one of the most unobtrusive biometric techniques that can be used for access control as well as surveillance purposes. Various methods for implementing face recognition have been proposed with varying degrees of performance in different scenarios. The most common issue with effective facial biometric systems is high susceptibility of variations in the face owing to different factors like changes in pose, varying illumination, different expression, presence of outliers, noise etc. This paper explores a novel technique for face recognition by performing classification of the face images using unsupervised learning approach through K-Medoids clustering. Partitioning Around Medoids algorithm (PAM) has been used for performing K-Medoids clustering of the data. The results are suggestive of increased robustness to noise and outliers in comparison to other clustering methods. Therefore the technique can also be used to increase the overall robustness of a face recognition system and thereby increase its invariance and make it a reliably usable biometric modality
Analysis of Feature Selection Algorithms (Branch & Bound and Beam search)Parinda Rajapaksha
Branch & Bound and Beam search algorithms were illustrated according to the feature selection domain. Presentation is structured as follows,
- Motivation
- Introduction
- Analysis
- Algorithm
- Pseudo Code
- Illustration of examples
- Applications
- Observations and Recommendations
- Comparison between two algorithms
- References
Self Organizing Maps: Fundamentals.
Introduction to Neural Networks.
1. What is a Self Organizing Map?
2. Topographic Maps
3. Setting up a Self Organizing Map
4. Kohonen Networks
5. Components of Self Organization
6. Overview of the SOM Algorithm
Black-box modeling of nonlinear system using evolutionary neural NARX modelIJECEIAES
Nonlinear systems with uncertainty and disturbance are very difficult to model using mathematic approach. Therefore, a black-box modeling approach without any prior knowledge is necessary. There are some modeling approaches have been used to develop a black box model such as fuzzy logic, neural network, and evolution algorithms. In this paper, an evolutionary neural network by combining a neural network and a modified differential evolution algorithm is applied to model a nonlinear system. The feasibility and effectiveness of the proposed modeling are tested on a piezoelectric actuator SISO system and an experimental quadruple tank MIMO system.
A COMPREHENSIVE ANALYSIS OF QUANTUM CLUSTERING : FINDING ALL THE POTENTIAL MI...IJDKP
Quantum clustering (QC), is a data clustering algorithm based on quantum mechanics which is
accomplished by substituting each point in a given dataset with a Gaussian. The width of the Gaussian is a
σ value, a hyper-parameter which can be manually defined and manipulated to suit the application.
Numerical methods are used to find all the minima of the quantum potential as they correspond to cluster
centers. Herein, we investigate the mathematical task of expressing and finding all the roots of the
exponential polynomial corresponding to the minima of a two-dimensional quantum potential. This is an
outstanding task because normally such expressions are impossible to solve analytically. However, we
prove that if the points are all included in a square region of size σ, there is only one minimum. This bound
is not only useful in the number of solutions to look for, by numerical means, it allows to to propose a new
numerical approach “per block”. This technique decreases the number of particles by approximating some
groups of particles to weighted particles. These findings are not only useful to the quantum clustering
problem but also for the exponential polynomials encountered in quantum chemistry, Solid-state Physics
and other applications.
Background Estimation Using Principal Component Analysis Based on Limited Mem...IJECEIAES
Given a video of 푀 frames of size ℎ × 푤. Background components of a video are the elements matrix which relative constant over 푀 frames. In PCA (principal component analysis) method these elements are referred as “principal components”. In video processing, background subtraction means excision of background component from the video. PCA method is used to get the background component. This method transforms 3 dimensions video (ℎ × 푤 × 푀) into 2 dimensions one (푁 × 푀), where 푁 is a linear array of size ℎ × 푤 . The principal components are the dominant eigenvectors which are the basis of an eigenspace. The limited memory block Krylov subspace optimization then is proposed to improve performance the computation. Background estimation is obtained as the projection each input image (the first frame at each sequence image) onto space expanded principal component. The procedure was run for the standard dataset namely SBI (Scene Background Initialization) dataset consisting of 8 videos with interval resolution [146 150, 352 240], total frame [258,500]. The performances are shown with 8 metrics, especially (in average for 8 videos) percentage of error pixels (0.24%), the percentage of clustered error pixels (0.21%), multiscale structural similarity index (0.88 form maximum 1), and running time (61.68 seconds).
Single Channel Speech De-noising Using Kernel Independent Component Analysis...theijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
Supervised learning: discover patterns in the data that relate data attributes with a target (class) attribute.
These patterns are then utilized to predict the values of the target attribute in future data instances.
Unsupervised learning: The data have no target attribute.
We want to explore the data to find some intrinsic structures in them.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
K-MEDOIDS CLUSTERING USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...ijscmc
Face recognition is one of the most unobtrusive biometric techniques that can be used for access control as well as surveillance purposes. Various methods for implementing face recognition have been proposed with varying degrees of performance in different scenarios. The most common issue with effective facial biometric systems is high susceptibility of variations in the face owing to different factors like changes in pose, varying illumination, different expression, presence of outliers, noise etc. This paper explores a novel technique for face recognition by performing classification of the face images using unsupervised learning approach through K-Medoids clustering. Partitioning Around Medoids algorithm (PAM) has been used for performing K-Medoids clustering of the data. The results are suggestive of increased robustness to noise and outliers in comparison to other clustering methods. Therefore the technique can also be used to increase the overall robustness of a face recognition system and thereby increase its invariance and make it a reliably usable biometric modality
Analysis of Feature Selection Algorithms (Branch & Bound and Beam search)Parinda Rajapaksha
Branch & Bound and Beam search algorithms were illustrated according to the feature selection domain. Presentation is structured as follows,
- Motivation
- Introduction
- Analysis
- Algorithm
- Pseudo Code
- Illustration of examples
- Applications
- Observations and Recommendations
- Comparison between two algorithms
- References
Self Organizing Maps: Fundamentals.
Introduction to Neural Networks.
1. What is a Self Organizing Map?
2. Topographic Maps
3. Setting up a Self Organizing Map
4. Kohonen Networks
5. Components of Self Organization
6. Overview of the SOM Algorithm
Black-box modeling of nonlinear system using evolutionary neural NARX modelIJECEIAES
Nonlinear systems with uncertainty and disturbance are very difficult to model using mathematic approach. Therefore, a black-box modeling approach without any prior knowledge is necessary. There are some modeling approaches have been used to develop a black box model such as fuzzy logic, neural network, and evolution algorithms. In this paper, an evolutionary neural network by combining a neural network and a modified differential evolution algorithm is applied to model a nonlinear system. The feasibility and effectiveness of the proposed modeling are tested on a piezoelectric actuator SISO system and an experimental quadruple tank MIMO system.
A COMPREHENSIVE ANALYSIS OF QUANTUM CLUSTERING : FINDING ALL THE POTENTIAL MI...IJDKP
Quantum clustering (QC), is a data clustering algorithm based on quantum mechanics which is
accomplished by substituting each point in a given dataset with a Gaussian. The width of the Gaussian is a
σ value, a hyper-parameter which can be manually defined and manipulated to suit the application.
Numerical methods are used to find all the minima of the quantum potential as they correspond to cluster
centers. Herein, we investigate the mathematical task of expressing and finding all the roots of the
exponential polynomial corresponding to the minima of a two-dimensional quantum potential. This is an
outstanding task because normally such expressions are impossible to solve analytically. However, we
prove that if the points are all included in a square region of size σ, there is only one minimum. This bound
is not only useful in the number of solutions to look for, by numerical means, it allows to to propose a new
numerical approach “per block”. This technique decreases the number of particles by approximating some
groups of particles to weighted particles. These findings are not only useful to the quantum clustering
problem but also for the exponential polynomials encountered in quantum chemistry, Solid-state Physics
and other applications.
Background Estimation Using Principal Component Analysis Based on Limited Mem...IJECEIAES
Given a video of 푀 frames of size ℎ × 푤. Background components of a video are the elements matrix which relative constant over 푀 frames. In PCA (principal component analysis) method these elements are referred as “principal components”. In video processing, background subtraction means excision of background component from the video. PCA method is used to get the background component. This method transforms 3 dimensions video (ℎ × 푤 × 푀) into 2 dimensions one (푁 × 푀), where 푁 is a linear array of size ℎ × 푤 . The principal components are the dominant eigenvectors which are the basis of an eigenspace. The limited memory block Krylov subspace optimization then is proposed to improve performance the computation. Background estimation is obtained as the projection each input image (the first frame at each sequence image) onto space expanded principal component. The procedure was run for the standard dataset namely SBI (Scene Background Initialization) dataset consisting of 8 videos with interval resolution [146 150, 352 240], total frame [258,500]. The performances are shown with 8 metrics, especially (in average for 8 videos) percentage of error pixels (0.24%), the percentage of clustered error pixels (0.21%), multiscale structural similarity index (0.88 form maximum 1), and running time (61.68 seconds).
On the High Dimentional Information Processing in Quaternionic Domain and its...IJAAS Team
There are various high dimensional engineering and scientific applications in communication, control, robotics, computer vision, biometrics, etc.; where researchers are facing problem to design an intelligent and robust neural system which can process higher dimensional information efficiently. The conventional real-valued neural networks are tried to solve the problem associated with high dimensional parameters, but the required network structure possesses high complexity and are very time consuming and weak to noise. These networks are also not able to learn magnitude and phase values simultaneously in space. The quaternion is the number, which possesses the magnitude in all four directions and phase information is embedded within it. This paper presents a well generalized learning machine with a quaternionic domain neural network that can finely process magnitude and phase information of high dimension data without any hassle. The learning and generalization capability of the proposed learning machine is presented through a wide spectrum of simulations which demonstrate the significance of the work.
Heuristic based adaptive step size clms algorithms for smart antennascsandit
A smart antenna system combines multiple antenna elements with a signal processing capability to
optimize its radiation and/or reception pattern automatically in response to the signal environment through
complex weight selection. The weight selection process to get suitable Array factor with low Half Power
Beam Width (HPBW) and Side Lobe Level (SLL) is a complex method. The aim of this task is to design a
new approach for smart antennas to minimize the noise and interference effects from external sources with
least number of iterations. This paper presents Heuristics based adaptive step size Complex Least Mean
Square (CLMS) model for Smart Antennas to speedup convergence. In this process Benveniste and
Mathews algorithms are used as heuristics with CLMS and the improvement of performance of Smart
Antenna System in terms of convergence rate and array factor are discussed and compared with the
performance of CLMS and Augmented CLMS (ACLMS) algorithms.
HEURISTIC BASED ADAPTIVE STEP SIZE CLMS ALGORITHMS FOR SMART ANTENNAScscpconf
A smart antenna system combines multiple antenna elements with a signal processing capability to optimize its radiation and/or reception pattern automatically in response to the signal environment through complex weight selection. The weight selection process to get suitable Array factor with low Half Power Beam Width (HPBW) and Side Lobe Level (SLL) is a complex method. The aim of this task is to design a
new approach for smart antennas to minimize the noise and interference effects from external sources with least number of iterations. This paper presents Heuristics based adaptive step size Complex Least Mean Square (CLMS) model for Smart Antennas to speedup convergence. In this process Benveniste and Mathews algorithms are used as heuristics with CLMS and the improvement of performance of Smart Antenna System in terms of convergence rate and arr ay factor are discussed and compared with the performance of CLMS and Augmented CLMS (ACLMS) algorithms.
Clustering using kernel entropy principal component analysis and variable ker...IJECEIAES
Clustering as unsupervised learning method is the mission of dividing data objects into clusters with common characteristics. In the present paper, we introduce an enhanced technique of the existing EPCA data transformation method. Incorporating the kernel function into the EPCA, the input space can be mapped implicitly into a high-dimensional of feature space. Then, the Shannon’s entropy estimated via the inertia provided by the contribution of every mapped object in data is the key measure to determine the optimal extracted features space. Our proposed method performs very well the clustering algorithm of the fast search of clusters’ centers based on the local densities’ computing. Experimental results disclose that the approach is feasible and efficient on the performance query.
Fixed Point Realization of Iterative LR-Aided Soft MIMO Decoding AlgorithmCSCJournals
Multiple-input multiple-output (MIMO) systems have been widely acclaimed in order to provide high data rates. Recently Lattice Reduction (LR) aided detectors have been proposed to achieve near Maximum Likelihood (ML) performance with low complexity. In this paper, we develop the fixed point design of an iterative soft decision based LR-aided K-best decoder, which reduces the complexity of existing sphere decoder. A simulation based word-length optimization is presented for physical implementation of the K-best decoder. Simulations show that the fixed point result of 16 bit precision can keep bit error rate (BER) degradation within 0.3 dB for 8×8 MIMO systems with different modulation schemes.
Performance of Matching Algorithmsfor Signal Approximationiosrjce
IOSR Journal of Electronics and Communication Engineering(IOSR-JECE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of electronics and communication engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in electronics and communication engineering. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
Classification of Iris Data using Kernel Radial Basis Probabilistic Neural N...Scientific Review SR
Radial Basis Probabilistic Neural Network (RBPNN) has a broader generalized capability that been
successfully applied to multiple fields. In this paper, the Euclidean distance of each data point in RBPNN is
extended by calculating its kernel-induced distance instead of the conventional sum-of squares distance. The
kernel function is a generalization of the distance metric that measures the distance between two data points as the
data points are mapped into a high dimensional space. During the comparing of the four constructed classification
models with Kernel RBPNN, Radial Basis Function networks, RBPNN and Back-Propagation networks as
proposed, results showed that, model classification on Iris Data with Kernel RBPNN display an outstanding
performance in this regard
Classification of Iris Data using Kernel Radial Basis Probabilistic Neural Ne...Scientific Review
Radial Basis Probabilistic Neural Network (RBPNN) has a broader generalized capability that been successfully applied to multiple fields. In this paper, the Euclidean distance of each data point in RBPNN is extended by calculating its kernel-induced distance instead of the conventional sum-of squares distance. The kernel function is a generalization of the distance metric that measures the distance between two data points as the data points are mapped into a high dimensional space. During the comparing of the four constructed classification models with Kernel RBPNN, Radial Basis Function networks, RBPNN and Back-Propagation networks as proposed, results showed that, model classification on Iris Data with Kernel RBPNN display an outstanding performance in this regard.
Multimode system condition monitoring using sparsity reconstruction for quali...IJECEIAES
In this paper, we introduce an improved multivariate statistical monitoring method based on the stacked sparse autoencoder (SSAE). Our contribution focuses on the choice of the SSAE model based on neural networks to solve diagnostic problems of complex systems. In order to monitor the process performance, the squared prediction error (SPE) chart is linked with nonparametric adaptive confidence bounds which arise from the kernel density estimation to minimize erroneous alerts. Then, faults are localized using two methods: contribution plots and sensor validity index (SVI). The results are obtained from experiments and real data from a drinkable water processing plant, demonstrating how the applied technique is performed. The simulation results of the SSAE model show a better ability to detect and identify sensor failures.
Max stable set problem to found the initial centroids in clustering problemnooriasukmaningtyas
In this paper, we propose a new approach to solve the document-clustering using the K-Means algorithm. The latter is sensitive to the random selection of the k cluster centroids in the initialization phase. To evaluate the quality of K-Means clustering we propose to model the text document clustering problem as the max stable set problem (MSSP) and use continuous Hopfield network to solve the MSSP problem to have initial centroids. The idea is inspired by the fact that MSSP and clustering share the same principle, MSSP consists to find the largest set of nodes completely disconnected in a graph, and in clustering, all objects are divided into disjoint clusters. Simulation results demonstrate that the proposed K-Means improved by MSSP (KM_MSSP) is efficient of large data sets, is much optimized in terms of time, and provides better quality of clustering than other methods.
COMPARATIVE PERFORMANCE ANALYSIS OF RNSC AND MCL ALGORITHMS ON POWER-LAW DIST...acijjournal
Cluster analysis of graph related problems is an important issue now-a-day. Different types of graph
clustering techniques are appeared in the field but most of them are vulnerable in terms of effectiveness
and fragmentation of output in case of real-world applications in diverse systems. In this paper, we will
provide a comparative behavioural analysis of RNSC (Restricted Neighbourhood Search Clustering) and
MCL (Markov Clustering) algorithms on Power-Law Distribution graphs. RNSC is a graph clustering
technique using stochastic local search. RNSC algorithm tries to achieve optimal cost clustering by
assigning some cost functions to the set of clusterings of a graph. This algorithm was implemented by A.
D. King only for undirected and unweighted random graphs. Another popular graph clustering
algorithm MCL is based on stochastic flow simulation model for weighted graphs. There are plentiful
applications of power-law or scale-free graphs in nature and society. Scale-free topology is stochastic i.e.
nodes are connected in a random manner. Complex network topologies like World Wide Web, the web of
human sexual contacts, or the chemical network of a cell etc., are basically following power-law
distribution to represent different real-life systems. This paper uses real large-scale power-law
distribution graphs to conduct the performance analysis of RNSC behaviour compared with Markov
clustering (MCL) algorithm. Extensive experimental results on several synthetic and real power-law
distribution datasets reveal the effectiveness of our approach to comparative performance measure of
these algorithms on the basis of cost of clustering, cluster size, modularity index of clustering results and
normalized mutual information (NMI).
On Selection of Periodic Kernels Parameters in Time Series Prediction cscpconf
In the paper the analysis of the periodic kernels parameters is described. Periodic kernels can
be used for the prediction task, performed as the typical regression problem. On the basis of the
Periodic Kernel Estimator (PerKE) the prediction of real time series is performed. As periodic
kernels require the setting of their parameters it is necessary to analyse their influence on the
prediction quality. This paper describes an easy methodology of finding values of parameters of
periodic kernels. It is based on grid search. Two different error measures are taken into
consideration as the prediction qualities but lead to comparable results. The methodology was
tested on benchmark and real datasets and proved to give satisfactory results.
ON SELECTION OF PERIODIC KERNELS PARAMETERS IN TIME SERIES PREDICTIONcscpconf
In the paper the analysis of the periodic kernels parameters is described. Periodic kernels can
be used for the prediction task, performed as the typical regression problem. On the basis of the
Periodic Kernel Estimator (PerKE) the prediction of real time series is performed. As periodic
kernels require the setting of their parameters it is necessary to analyse their influence on the
prediction quality. This paper describes an easy methodology of finding values of parameters of
periodic kernels. It is based on grid search. Two different error measures are taken into
consideration as the prediction qualities but lead to comparable results. The methodology was
tested on benchmark and real datasets and proved to give satisfactory results.
In recent machine learning community, there is a trend of constructing a linear logarithm version of
nonlinear version through the ‘kernel method’ for example kernel principal component analysis, kernel
fisher discriminant analysis, support Vector Machines (SVMs), and the current kernel clustering
algorithms. Typically, in unsupervised methods of clustering algorithms utilizing kernel method, a
nonlinear mapping is operated initially in order to map the data into a much higher space feature, and then
clustering is executed. A hitch of these kernel clustering algorithms is that the clustering prototype resides
in increased features specs of dimensions and therefore lack intuitive and clear descriptions without
utilizing added approximation of projection from the specs to the data as executed in the literature
presented. This paper aims to utilize the ‘kernel method’, a novel clustering algorithm, founded on the
conventional fuzzy clustering algorithm (FCM) is anticipated and known as kernel fuzzy c-means algorithm
(KFCM). This method embraces a novel kernel-induced metric in the space of data in order to interchange
the novel Euclidean matric norm in cluster prototype and fuzzy clustering algorithm still reside in the space
of data so that the results of clustering could be interpreted and reformulated in the spaces which are
original. This property is used for clustering incomplete data. Execution on supposed data illustrate that
KFCM has improved performance of clustering and stout as compare to other transformations of FCM for
clustering incomplete data.
Similar to Kernal based speaker specific feature extraction and its applications in iTaukei cross language speaker recognition (20)
Amazon products reviews classification based on machine learning, deep learni...TELKOMNIKA JOURNAL
In recent times, the trend of online shopping through e-commerce stores and websites has grown to a huge extent. Whenever a product is purchased on an e-commerce platform, people leave their reviews about the product. These reviews are very helpful for the store owners and the product’s manufacturers for the betterment of their work process as well as product quality. An automated system is proposed in this work that operates on two datasets D1 and D2 obtained from Amazon. After certain preprocessing steps, N-gram and word embedding-based features are extracted using term frequency-inverse document frequency (TF-IDF), bag of words (BoW) and global vectors (GloVe), and Word2vec, respectively. Four machine learning (ML) models support vector machines (SVM), logistic regression (RF), logistic regression (LR), multinomial Naïve Bayes (MNB), two deep learning (DL) models convolutional neural network (CNN), long-short term memory (LSTM), and standalone bidirectional encoder representations (BERT) are used to classify reviews as either positive or negative. The results obtained by the standard ML, DL models and BERT are evaluated using certain performance evaluation measures. BERT turns out to be the best-performing model in the case of D1 with an accuracy of 90% on features derived by word embedding models while the CNN provides the best accuracy of 97% upon word embedding features in the case of D2. The proposed model shows better overall performance on D2 as compared to D1.
Design, simulation, and analysis of microstrip patch antenna for wireless app...TELKOMNIKA JOURNAL
In this study, a microstrip patch antenna that works at 3.6 GHz was built and tested to see how well it works. In this work, Rogers RT/Duroid 5880 has been used as the substrate material, with a dielectric permittivity of 2.2 and a thickness of 0.3451 mm; it serves as the base for the examined antenna. The computer simulation technology (CST) studio suite is utilized to show the recommended antenna design. The goal of this study was to get a more extensive transmission capacity, a lower voltage standing wave ratio (VSWR), and a lower return loss, but the main goal was to get a higher gain, directivity, and efficiency. After simulation, the return loss, gain, directivity, bandwidth, and efficiency of the supplied antenna are found to be -17.626 dB, 9.671 dBi, 9.924 dBi, 0.2 GHz, and 97.45%, respectively. Besides, the recreation uncovered that the transfer speed side-lobe level at phi was much better than those of the earlier works, at -28.8 dB, respectively. Thus, it makes a solid contender for remote innovation and more robust communication.
Design and simulation an optimal enhanced PI controller for congestion avoida...TELKOMNIKA JOURNAL
In this paper, snake optimization algorithm (SOA) is used to find the optimal gains of an enhanced controller for controlling congestion problem in computer networks. M-file and Simulink platform is adopted to evaluate the response of the active queue management (AQM) system, a comparison with two classical controllers is done, all tuned gains of controllers are obtained using SOA method and the fitness function chose to monitor the system performance is the integral time absolute error (ITAE). Transient analysis and robust analysis is used to show the proposed controller performance, two robustness tests are applied to the AQM system, one is done by varying the size of queue value in different period and the other test is done by changing the number of transmission control protocol (TCP) sessions with a value of ± 20% from its original value. The simulation results reflect a stable and robust behavior and best performance is appeared clearly to achieve the desired queue size without any noise or any transmission problems.
Improving the detection of intrusion in vehicular ad-hoc networks with modifi...TELKOMNIKA JOURNAL
Vehicular ad-hoc networks (VANETs) are wireless-equipped vehicles that form networks along the road. The security of this network has been a major challenge. The identity-based cryptosystem (IBC) previously used to secure the networks suffers from membership authentication security features. This paper focuses on improving the detection of intruders in VANETs with a modified identity-based cryptosystem (MIBC). The MIBC is developed using a non-singular elliptic curve with Lagrange interpolation. The public key of vehicles and roadside units on the network are derived from number plates and location identification numbers, respectively. Pseudo-identities are used to mask the real identity of users to preserve their privacy. The membership authentication mechanism ensures that only valid and authenticated members of the network are allowed to join the network. The performance of the MIBC is evaluated using intrusion detection ratio (IDR) and computation time (CT) and then validated with the existing IBC. The result obtained shows that the MIBC recorded an IDR of 99.3% against 94.3% obtained for the existing identity-based cryptosystem (EIBC) for 140 unregistered vehicles attempting to intrude on the network. The MIBC shows lower CT values of 1.17 ms against 1.70 ms for EIBC. The MIBC can be used to improve the security of VANETs.
Conceptual model of internet banking adoption with perceived risk and trust f...TELKOMNIKA JOURNAL
Understanding the primary factors of internet banking (IB) acceptance is critical for both banks and users; nevertheless, our knowledge of the role of users’ perceived risk and trust in IB adoption is limited. As a result, we develop a conceptual model by incorporating perceived risk and trust into the technology acceptance model (TAM) theory toward the IB. The proper research emphasized that the most essential component in explaining IB adoption behavior is behavioral intention to use IB adoption. TAM is helpful for figuring out how elements that affect IB adoption are connected to one another. According to previous literature on IB and the use of such technology in Iraq, one has to choose a theoretical foundation that may justify the acceptance of IB from the customer’s perspective. The conceptual model was therefore constructed using the TAM as a foundation. Furthermore, perceived risk and trust were added to the TAM dimensions as external factors. The key objective of this work was to extend the TAM to construct a conceptual model for IB adoption and to get sufficient theoretical support from the existing literature for the essential elements and their relationships in order to unearth new insights about factors responsible for IB adoption.
Efficient combined fuzzy logic and LMS algorithm for smart antennaTELKOMNIKA JOURNAL
The smart antennas are broadly used in wireless communication. The least mean square (LMS) algorithm is a procedure that is concerned in controlling the smart antenna pattern to accommodate specified requirements such as steering the beam toward the desired signal, in addition to placing the deep nulls in the direction of unwanted signals. The conventional LMS (C-LMS) has some drawbacks like slow convergence speed besides high steady state fluctuation error. To overcome these shortcomings, the present paper adopts an adaptive fuzzy control step size least mean square (FC-LMS) algorithm to adjust its step size. Computer simulation outcomes illustrate that the given model has fast convergence rate as well as low mean square error steady state.
Design and implementation of a LoRa-based system for warning of forest fireTELKOMNIKA JOURNAL
This paper presents the design and implementation of a forest fire monitoring and warning system based on long range (LoRa) technology, a novel ultra-low power consumption and long-range wireless communication technology for remote sensing applications. The proposed system includes a wireless sensor network that records environmental parameters such as temperature, humidity, wind speed, and carbon dioxide (CO2) concentration in the air, as well as taking infrared photos.The data collected at each sensor node will be transmitted to the gateway via LoRa wireless transmission. Data will be collected, processed, and uploaded to a cloud database at the gateway. An Android smartphone application that allows anyone to easily view the recorded data has been developed. When a fire is detected, the system will sound a siren and send a warning message to the responsible personnel, instructing them to take appropriate action. Experiments in Tram Chim Park, Vietnam, have been conducted to verify and evaluate the operation of the system.
Wavelet-based sensing technique in cognitive radio networkTELKOMNIKA JOURNAL
Cognitive radio is a smart radio that can change its transmitter parameter based on interaction with the environment in which it operates. The demand for frequency spectrum is growing due to a big data issue as many Internet of Things (IoT) devices are in the network. Based on previous research, most frequency spectrum was used, but some spectrums were not used, called spectrum hole. Energy detection is one of the spectrum sensing methods that has been frequently used since it is easy to use and does not require license users to have any prior signal understanding. But this technique is incapable of detecting at low signal-to-noise ratio (SNR) levels. Therefore, the wavelet-based sensing is proposed to overcome this issue and detect spectrum holes. The main objective of this work is to evaluate the performance of wavelet-based sensing and compare it with the energy detection technique. The findings show that the percentage of detection in wavelet-based sensing is 83% higher than energy detection performance. This result indicates that the wavelet-based sensing has higher precision in detection and the interference towards primary user can be decreased.
A novel compact dual-band bandstop filter with enhanced rejection bandsTELKOMNIKA JOURNAL
In this paper, we present the design of a new wide dual-band bandstop filter (DBBSF) using nonuniform transmission lines. The method used to design this filter is to replace conventional uniform transmission lines with nonuniform lines governed by a truncated Fourier series. Based on how impedances are profiled in the proposed DBBSF structure, the fractional bandwidths of the two 10 dB-down rejection bands are widened to 39.72% and 52.63%, respectively, and the physical size has been reduced compared to that of the filter with the uniform transmission lines. The results of the electromagnetic (EM) simulation support the obtained analytical response and show an improved frequency behavior.
Deep learning approach to DDoS attack with imbalanced data at the application...TELKOMNIKA JOURNAL
A distributed denial of service (DDoS) attack is where one or more computers attack or target a server computer, by flooding internet traffic to the server. As a result, the server cannot be accessed by legitimate users. A result of this attack causes enormous losses for a company because it can reduce the level of user trust, and reduce the company’s reputation to lose customers due to downtime. One of the services at the application layer that can be accessed by users is a web-based lightweight directory access protocol (LDAP) service that can provide safe and easy services to access directory applications. We used a deep learning approach to detect DDoS attacks on the CICDDoS 2019 dataset on a complex computer network at the application layer to get fast and accurate results for dealing with unbalanced data. Based on the results obtained, it is observed that DDoS attack detection using a deep learning approach on imbalanced data performs better when implemented using synthetic minority oversampling technique (SMOTE) method for binary classes. On the other hand, the proposed deep learning approach performs better for detecting DDoS attacks in multiclass when implemented using the adaptive synthetic (ADASYN) method.
The appearance of uncertainties and disturbances often effects the characteristics of either linear or nonlinear systems. Plus, the stabilization process may be deteriorated thus incurring a catastrophic effect to the system performance. As such, this manuscript addresses the concept of matching condition for the systems that are suffering from miss-match uncertainties and exogeneous disturbances. The perturbation towards the system at hand is assumed to be known and unbounded. To reach this outcome, uncertainties and their classifications are reviewed thoroughly. The structural matching condition is proposed and tabulated in the proposition 1. Two types of mathematical expressions are presented to distinguish the system with matched uncertainty and the system with miss-matched uncertainty. Lastly, two-dimensional numerical expressions are provided to practice the proposed proposition. The outcome shows that matching condition has the ability to change the system to a design-friendly model for asymptotic stabilization.
Implementation of FinFET technology based low power 4×4 Wallace tree multipli...TELKOMNIKA JOURNAL
Many systems, including digital signal processors, finite impulse response (FIR) filters, application-specific integrated circuits, and microprocessors, use multipliers. The demand for low power multipliers is gradually rising day by day in the current technological trend. In this study, we describe a 4×4 Wallace multiplier based on a carry select adder (CSA) that uses less power and has a better power delay product than existing multipliers. HSPICE tool at 16 nm technology is used to simulate the results. In comparison to the traditional CSA-based multiplier, which has a power consumption of 1.7 µW and power delay product (PDP) of 57.3 fJ, the results demonstrate that the Wallace multiplier design employing CSA with first zero finding logic (FZF) logic has the lowest power consumption of 1.4 µW and PDP of 27.5 fJ.
Evaluation of the weighted-overlap add model with massive MIMO in a 5G systemTELKOMNIKA JOURNAL
The flaw in 5G orthogonal frequency division multiplexing (OFDM) becomes apparent in high-speed situations. Because the doppler effect causes frequency shifts, the orthogonality of OFDM subcarriers is broken, lowering both their bit error rate (BER) and throughput output. As part of this research, we use a novel design that combines massive multiple input multiple output (MIMO) and weighted overlap and add (WOLA) to improve the performance of 5G systems. To determine which design is superior, throughput and BER are calculated for both the proposed design and OFDM. The results of the improved system show a massive improvement in performance ver the conventional system and significant improvements with massive MIMO, including the best throughput and BER. When compared to conventional systems, the improved system has a throughput that is around 22% higher and the best performance in terms of BER, but it still has around 25% less error than OFDM.
Reflector antenna design in different frequencies using frequency selective s...TELKOMNIKA JOURNAL
In this study, it is aimed to obtain two different asymmetric radiation patterns obtained from antennas in the shape of the cross-section of a parabolic reflector (fan blade type antennas) and antennas with cosecant-square radiation characteristics at two different frequencies from a single antenna. For this purpose, firstly, a fan blade type antenna design will be made, and then the reflective surface of this antenna will be completed to the shape of the reflective surface of the antenna with the cosecant-square radiation characteristic with the frequency selective surface designed to provide the characteristics suitable for the purpose. The frequency selective surface designed and it provides the perfect transmission as possible at 4 GHz operating frequency, while it will act as a band-quenching filter for electromagnetic waves at 5 GHz operating frequency and will be a reflective surface. Thanks to this frequency selective surface to be used as a reflective surface in the antenna, a fan blade type radiation characteristic at 4 GHz operating frequency will be obtained, while a cosecant-square radiation characteristic at 5 GHz operating frequency will be obtained.
Reagentless iron detection in water based on unclad fiber optical sensorTELKOMNIKA JOURNAL
A simple and low-cost fiber based optical sensor for iron detection is demonstrated in this paper. The sensor head consist of an unclad optical fiber with the unclad length of 1 cm and it has a straight structure. Results obtained shows a linear relationship between the output light intensity and iron concentration, illustrating the functionality of this iron optical sensor. Based on the experimental results, the sensitivity and linearity are achieved at 0.0328/ppm and 0.9824 respectively at the wavelength of 690 nm. With the same wavelength, other performance parameters are also studied. Resolution and limit of detection (LOD) are found to be 0.3049 ppm and 0.0755 ppm correspondingly. This iron sensor is advantageous in that it does not require any reagent for detection, enabling it to be simpler and cost-effective in the implementation of the iron sensing.
Impact of CuS counter electrode calcination temperature on quantum dot sensit...TELKOMNIKA JOURNAL
In place of the commercial Pt electrode used in quantum sensitized solar cells, the low-cost CuS cathode is created using electrophoresis. High resolution scanning electron microscopy and X-ray diffraction were used to analyze the structure and morphology of structural cubic samples with diameters ranging from 40 nm to 200 nm. The conversion efficiency of solar cells is significantly impacted by the calcination temperatures of cathodes at 100 °C, 120 °C, 150 °C, and 180 °C under vacuum. The fluorine doped tin oxide (FTO)/CuS cathode electrode reached a maximum efficiency of 3.89% when it was calcined at 120 °C. Compared to other temperature combinations, CuS nanoparticles crystallize at 120 °C, which lowers resistance while increasing electron lifetime.
In place of the commercial Pt electrode used in quantum sensitized solar cells, the low-cost CuS cathode is created using electrophoresis. High resolution scanning electron microscopy and X-ray diffraction were used to analyze the structure and morphology of structural cubic samples with diameters ranging from 40 nm to 200 nm. The conversion efficiency of solar cells is significantly impacted by the calcination temperatures of cathodes at 100 °C, 120 °C, 150 °C, and 180 °C under vacuum. The fluorine doped tin oxide (FTO)/CuS cathode electrode reached a maximum efficiency of 3.89% when it was calcined at 120 °C. Compared to other temperature combinations, CuS nanoparticles crystallize at 120 °C, which lowers resistance while increasing electron lifetime.
A progressive learning for structural tolerance online sequential extreme lea...TELKOMNIKA JOURNAL
This article discusses the progressive learning for structural tolerance online sequential extreme learning machine (PSTOS-ELM). PSTOS-ELM can save robust accuracy while updating the new data and the new class data on the online training situation. The robustness accuracy arises from using the householder block exact QR decomposition recursive least squares (HBQRD-RLS) of the PSTOS-ELM. This method is suitable for applications that have data streaming and often have new class data. Our experiment compares the PSTOS-ELM accuracy and accuracy robustness while data is updating with the batch-extreme learning machine (ELM) and structural tolerance online sequential extreme learning machine (STOS-ELM) that both must retrain the data in a new class data case. The experimental results show that PSTOS-ELM has accuracy and robustness comparable to ELM and STOS-ELM while also can update new class data immediately.
Electroencephalography-based brain-computer interface using neural networksTELKOMNIKA JOURNAL
This study aimed to develop a brain-computer interface that can control an electric wheelchair using electroencephalography (EEG) signals. First, we used the Mind Wave Mobile 2 device to capture raw EEG signals from the surface of the scalp. The signals were transformed into the frequency domain using fast Fourier transform (FFT) and filtered to monitor changes in attention and relaxation. Next, we performed time and frequency domain analyses to identify features for five eye gestures: opened, closed, blink per second, double blink, and lookup. The base state was the opened-eyes gesture, and we compared the features of the remaining four action gestures to the base state to identify potential gestures. We then built a multilayer neural network to classify these features into five signals that control the wheelchair’s movement. Finally, we designed an experimental wheelchair system to test the effectiveness of the proposed approach. The results demonstrate that the EEG classification was highly accurate and computationally efficient. Moreover, the average performance of the brain-controlled wheelchair system was over 75% across different individuals, which suggests the feasibility of this approach.
Adaptive segmentation algorithm based on level set model in medical imagingTELKOMNIKA JOURNAL
For image segmentation, level set models are frequently employed. It offer best solution to overcome the main limitations of deformable parametric models. However, the challenge when applying those models in medical images stills deal with removing blurs in image edges which directly affects the edge indicator function, leads to not adaptively segmenting images and causes a wrong analysis of pathologies wich prevents to conclude a correct diagnosis. To overcome such issues, an effective process is suggested by simultaneously modelling and solving systems’ two-dimensional partial differential equations (PDE). The first PDE equation allows restoration using Euler’s equation similar to an anisotropic smoothing based on a regularized Perona and Malik filter that eliminates noise while preserving edge information in accordance with detected contours in the second equation that segments the image based on the first equation solutions. This approach allows developing a new algorithm which overcome the studied model drawbacks. Results of the proposed method give clear segments that can be applied to any application. Experiments on many medical images in particular blurry images with high information losses, demonstrate that the developed approach produces superior segmentation results in terms of quantity and quality compared to other models already presented in previeous works.
Automatic channel selection using shuffled frog leaping algorithm for EEG bas...TELKOMNIKA JOURNAL
Drug addiction is a complex neurobiological disorder that necessitates comprehensive treatment of both the body and mind. It is categorized as a brain disorder due to its impact on the brain. Various methods such as electroencephalography (EEG), functional magnetic resonance imaging (FMRI), and magnetoencephalography (MEG) can capture brain activities and structures. EEG signals provide valuable insights into neurological disorders, including drug addiction. Accurate classification of drug addiction from EEG signals relies on appropriate features and channel selection. Choosing the right EEG channels is essential to reduce computational costs and mitigate the risk of overfitting associated with using all available channels. To address the challenge of optimal channel selection in addiction detection from EEG signals, this work employs the shuffled frog leaping algorithm (SFLA). SFLA facilitates the selection of appropriate channels, leading to improved accuracy. Wavelet features extracted from the selected input channel signals are then analyzed using various machine learning classifiers to detect addiction. Experimental results indicate that after selecting features from the appropriate channels, classification accuracy significantly increased across all classifiers. Particularly, the multi-layer perceptron (MLP) classifier combined with SFLA demonstrated a remarkable accuracy improvement of 15.78% while reducing time complexity.
Saudi Arabia stands as a titan in the global energy landscape, renowned for its abundant oil and gas resources. It's the largest exporter of petroleum and holds some of the world's most significant reserves. Let's delve into the top 10 oil and gas projects shaping Saudi Arabia's energy future in 2024.
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)MdTanvirMahtab2
This presentation is about the working procedure of Shahjalal Fertilizer Company Limited (SFCL). A Govt. owned Company of Bangladesh Chemical Industries Corporation under Ministry of Industries.
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Dr.Costas Sachpazis
Terzaghi's soil bearing capacity theory, developed by Karl Terzaghi, is a fundamental principle in geotechnical engineering used to determine the bearing capacity of shallow foundations. This theory provides a method to calculate the ultimate bearing capacity of soil, which is the maximum load per unit area that the soil can support without undergoing shear failure. The Calculation HTML Code included.
CW RADAR, FMCW RADAR, FMCW ALTIMETER, AND THEIR PARAMETERSveerababupersonal22
It consists of cw radar and fmcw radar ,range measurement,if amplifier and fmcw altimeterThe CW radar operates using continuous wave transmission, while the FMCW radar employs frequency-modulated continuous wave technology. Range measurement is a crucial aspect of radar systems, providing information about the distance to a target. The IF amplifier plays a key role in signal processing, amplifying intermediate frequency signals for further analysis. The FMCW altimeter utilizes frequency-modulated continuous wave technology to accurately measure altitude above a reference point.
We have compiled the most important slides from each speaker's presentation. This year’s compilation, available for free, captures the key insights and contributions shared during the DfMAy 2024 conference.
Cosmetic shop management system project report.pdfKamal Acharya
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's thought to interpret those ingredient lists unless you have a background in chemistry.
Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. It includes various function programs to do the above mentioned tasks.
Data file handling has been effectively used in the program.
The automated cosmetic shop management system should deal with the automation of general workflow and administration process of the shop. The main processes of the system focus on customer's request where the system is able to search the most appropriate products and deliver it to the customers. It should help the employees to quickly identify the list of cosmetic product that have reached the minimum quantity and also keep a track of expired date for each cosmetic product. It should help the employees to find the rack number in which the product is placed.It is also Faster and more efficient way.
Kernal based speaker specific feature extraction and its applications in iTaukei cross language speaker recognition
1. TELKOMNIKA Telecommunication, Computing, Electronics and Control
Vol. 18, No. 5, October 2020, pp. 2488~2497
ISSN: 1693-6930, accredited First Grade by Kemenristekdikti, Decree No: 21/E/KPT/2018
DOI: 10.12928/TELKOMNIKA.v18i5.14655 2488
Journal homepage: http://journal.uad.ac.id/index.php/TELKOMNIKA
Kernal based speaker specific feature extraction and its
applications in iTaukei cross language speaker recognition
Satyanand Singh1
, Pragya Singh2
1
School of Electrical and Electronics Engineering, Fiji National Universisty, Fiji
2
School of Public Health and Primary Care, Fiji National Universisty, Fiji
Article Info ABSTRACT
Article history:
Received Nov 21, 2019
Revised May 2, 2020
Accepted May 11, 2020
Extraction and classification algorithms based on kernel nonlinear features are
popular in the new direction of research in machine learning. This research
paper considers their practical application in the iTaukei automatic speaker
recognition system (ASR) for cross-language speech recognition. Second,
nonlinear speaker-specific extraction methods such as kernel principal
component analysis (KPCA), kernel independent component analysis (KICA),
and kernel linear discriminant analysis (KLDA) are summarized.
The conversion effects on subsequent classifications were tested in
conjunction with Gaussian mixture modeling (GMM) learning algorithms; in
most cases, computations were found to have a beneficial effect on
classification performance. Additionally, the best results were achieved by
the Kernel linear discriminant analysis (KLDA) algorithm. The performance
of the ASR system is evaluated for clear speech to a wide range of speech
quality using ATR Japanese C language corpus and self-recorded iTaukei
corpus. The ASR efficiency of KLDA, KICA, and KLDA technique for 6 sec
of ATR Japanese C language corpus 99.7%, 99.6%, and 99.1% and equal error
rate (EER) are 1.95%, 2.31%, and 3.41% respectively. The EER improvement
of the KLDA technique-based ASR system compared with KICA and KPCA
is 4.25% and 8.51% respectively.
Keywords:
Automatic speaker recognition
system
Kernel independent component
analysis
Kernel linear discriminant
analysis
Kernel principal component
analysis
Principal component analysis
This is an open access article under the CC BY-SA license.
Corresponding Author:
Satyanand Singh,
School of Electrical and Electronics Engineering,
Fiji National Universisty, Fiji.
Email: satyanand.singh@fnu.ac.fj
1. INTRODUCTION
ASR is implemented using very conventional statistical modeling techniques such as GMM or ANN
modeling. But in the past few years, machine learning theory has evolved into a variety of new algorithms for
learning and classification. The so-called kernel-based method, in particular, has recently become a promising
new path to science. Kernel-based classification and regression techniques like the well-known SVM found
a fairly slow expression. That may be because to address theoretical and practical problems it needs to be
applied to large tasks such as speech recognition. Recently, however, more and more authors have been
concerned about the use of support vector machines in speech recognition [1].
Besides using kernel-based classifiers, an alternate way is to use kernel-based technologies only to
convert the feature space and leave the classification job to more conventional methods. The purpose of this
paper is to study the applicability of some of these methods to classify phonemes, using kernel-based
pre-learning speaker-specific feature extraction methods to improve ASR classification rates. This paper
mainly discusses KPCA [2], KICA [3], KLDA.
2. TELKOMNIKA Telecommun Comput El Control
Kernal based speaker specific feature extraction and its applications in… (Satyanand Singh)
2489
Usually, a traditional ASR process consists of two phases: a training phase, and a test phase.
In the training phase, the device extracts speaker-specific characteristics from the speech signal to be used to
create a speaker model [1], where the aim of the test phase is to determine the speaking samples that fit
the individual of the training sample. he original speech signal is transformed into a vector representation of
the function [2] in all audio signal processing. Linear prediction cepstral coefficients (LPCC) and perceptual
linearity predicted cepstrum coefficient (PLPCC), mal frequency cepstral coefficient (MFCC) [3] approach is
most commonly used in the ASR system to obtain speaker-specific features. For modeling, discriminant
classifiers in support vector machine (SVM) [4] representation have achieved impressive results in many ASR
systems. SVM will definitely effectively train non-linear boundaries for decision-making by classifying
interesting speakers/imposters as they are distinct.
Although these feature extraction techniques are effective, non-linear mapping of speech features to
new suitable spaces may generate new features that can better identify speech categories. Kernel-based
technology has been applied to a variety of learning machines, including support vector machines (SVM),
Kernel discriminant analysis (KDA), kernel principal component analysis (KPCA) [5]. The latter two methods
are widely used in image recognition. Their performance in speaker recognition, however, has not been
carefully investigated.
The purpose of this paper is to examine the applicability of some of these methods to classify
phonemes, using kernel-based feature extraction methods applied before learning to boost classification levels.
Essentially, this paper deals with the strategies of KPCA, KICA [6, 7], KLDA [5], and Kernel springy
discriminant analysis (KSDA) [8]. In this work, KPCA, KICA, and KLDA is used for speaker specific feature
extraction with an ASR system. With KPCA, speaker-specific features can be expressed in a high dimension
space which can possibly generate more distinguishable speaker features.
2. FUNDAMENTAL OF PRINCIPAL COMPONENT ANALYSIS
Principal component analysis (PCA) is a very common method of dimensionality reduction and
feature extraction. PCA attempts to find linear subspaces that are smaller in size than the original feature space,
with new features having the largest variance [9]. Consider the data set {𝑥𝑖} where 𝑖 = 1, 2, 3, … . , 𝑁, each 𝑥𝑖
is a D-dimensional vector. Now we project the data into the 𝑀-dimensional subspace, here, 𝑀 < 𝐷.
The projection is represented as 𝑦 = 𝐴𝑥, where 𝐴 = [𝑢1
𝑇
, 𝑢2
𝑇
, … . , 𝑢 𝑀
𝑇 ], 𝑎𝑛𝑑 𝑢 𝑘
𝑇
𝑢 𝑘 = 1 for 𝑘 = 1,2,3, … . , 𝑀.
We want to maximize the variance of {𝑦𝑖}, which is the trace of the covariance matrix of {𝑦𝑖}.
𝐴∗
= arg
𝑚𝑎𝑥
𝐴
𝑡𝑟(𝑆 𝑦) (1)
where,
𝑆 𝑦 =
1
𝑁
∑ (𝑦𝑖 − 𝑦̅)(𝑦𝑖 − 𝑦̅) 𝑇𝑁
𝑖=1 (2)
and
𝑦̅ =
1
𝑁
∑ 𝑋𝑖
𝑁
𝑖=1 (3)
Covarience matrix of {𝑋𝑖} is the 𝑆 𝑋, since 𝑡𝑟(𝑆 𝑦) = 𝑡𝑟(𝐴𝑆 𝑋 𝐴 𝑇), by using the Lagrangian multiplier
and taking the derivative, we get,
𝑆 𝑋 𝑢 𝑘 = 𝜆 𝑘 𝑢 𝑘 (4)
which indicates 𝑢 𝑘 is the eigenvector of 𝑆 𝑋 and now 𝑋𝑖 can be represented as follows;
𝑋𝑖 = ∑ (𝑋𝑖
𝑇
𝑢 𝑘)𝐷
𝑘=1 𝑢 𝑘 (5)
𝑋𝑖 can be approximated by 𝑋̃𝑖 and expressed as follows:
𝑋̃ = ∑ (𝑋𝑖
𝑇
𝑢 𝑘)𝐷
𝑘=1 𝑢 𝑘 (6)
where 𝑢 𝑘is the eigenvector of 𝑆 𝑋corresponding to the kth largest eigenvalue. Standard PCA results for
the two-speaker’s audio data shown in Figure 1 (a).
3. ISSN: 1693-6930
TELKOMNIKA Telecommun Comput El Control, Vol. 18, No. 5, October 2020: 2488 - 2497
2490
2.1. Kernel PCA methodology for dimensionality reduction in ASR system
Standard PCA only allows linear size reduction. However, standard PCA is not very useful when
the data has a more complex structure that cannot be represented well in linear subspaces. Fortunately,
the kernel PCA allows us to extend the standard PCA to nonlinear dimensionality reduction [10]. Assume that
a set of observations is given 𝑋𝑖 ∈ ℝ 𝑛
, 𝑖 = 1, 2, 3, … . , 𝑚. Consider the inner dot product space 𝐹 associated
with the input space by a map 𝜙: ℝ 𝑛
→ 𝐹 may be non-linear. The feature space 𝐹 has an arbitrary size and
in some cases has an infinite dimension. Here, uppercase letters used for elements of 𝐹, and lowercase letters
are used for elements of ℝ 𝑛
. Suppose we are working on centered data ∑ 𝜙(𝑋𝑖) = 0𝑚
𝑖=1 . In F, the covariance
matrix has the form as follows:
𝐶 =
1
𝑚
∑ 𝜙(𝑋𝑗)𝜙(𝑋𝑗)
𝑇𝑚
𝑗=1 (7)
Eigenvalues 𝜆 ≥ 0 and nonzero eigenvectors 𝑉 ∈ 𝐹(0) satisfying 𝐶𝑉 = 𝜆𝑉. It is well known that all
solutions 𝑉 with 𝜆 ≠ 0 are in the range of {𝜙(𝑋𝑖)}𝑖=1
𝑚
. This has two consequences. First, consider a set of
equations 〈𝜙(𝑋 𝑘), 𝐶𝑉〉 = 𝜆〈𝜙(𝑋 𝑘), 𝑉〉, for all 𝑘 = 1,2,3, … , 𝑚 and second there exist coefficients
𝛼𝑖, 𝑖 = 1,2,3, … , 𝑚 in such a way that 𝑉 = ∑ 𝛼𝑖 𝜙(𝑋𝑖)𝑚
𝑖=1 . Combining 〈𝜙(𝑋 𝑘), 𝐶𝑉〉 = 𝜆〈𝜙(𝑋 𝑘), 𝑉〉 and
𝑉 = ∑ 𝛼𝑖 𝜙(𝑋𝑖)𝑚
𝑖=1 we get the dual representation of the eigenvalue problem as
1
𝑚
∑ 𝛼𝑖〈𝜙(𝑋 𝑘), ∑ 𝜙(𝑋𝑗)〈𝜙(𝑋𝑗), 𝜙(𝑋𝑖)〉𝑚
𝑗=1 〉 = 𝜆 ∑ 𝛼𝑖〈𝜙(𝑋 𝑘), 𝜙(𝑋𝑖)〉𝑚
𝑖=1
𝑚
𝑖=1 for all 𝑘 = 1,2,3, … , 𝑚. We are
defining a 𝑚𝑋𝑚 matrix by 𝐾𝑖𝑗 = 𝜙(𝑋𝑖), 𝜙(𝑋𝑗), this makes 𝐾2
𝛼 = 𝑚𝜆𝐾𝛼. Where 𝛼 denoted as a column
vectors with 𝛼1, 𝛼2, 𝛼3, … . , 𝛼 𝑚 entries.
Let 𝜆1 ≥ 𝜆2 ≥ ⋯ 𝜆 𝑚 be the eigenvalue of 𝐾, 𝛼1
, 𝛼2
, … , 𝛼 𝑚
be the set of corresponding eigenvectors,
and 𝜆 𝑟 be the last non-zero eigenvalue. Normalizing 𝛼1
, 𝛼2
, … , 𝛼 𝑟
by needing the corresponding vectors in 𝐹
be normalized 〈𝑉 𝑘
, 𝑉 𝑘〉 = 1, for all 𝑘 = 1,2, … , 𝑟. Considering 𝑉 = ∑ 𝛼𝑖 𝜙(𝑋𝑖)𝑚
𝑖=1 and 𝐾𝛼 = 𝑚𝜆𝛼,
the normalization condition of 𝛼1
, 𝛼2
, … , 𝛼 𝑟
can be rewritten as follows;
1 = ∑ 𝛼𝑖
𝑘𝑚
𝑖,𝑗 𝛼𝑗
𝑘
〈𝜙(𝑥𝑖), 𝜙(𝑥𝑗)〉 = ∑ 𝛼𝑖
𝑘𝑚
𝑖,𝑗 𝛼𝑗
𝑘
𝐾𝑖,𝑗 = 〈𝛼 𝑘
, 𝐾𝛼 𝑘〉 = 𝜆 𝑘〈𝛼 𝑘
, 𝛼 𝑘〉 (8)
for the purpose of principal component extraction, we need to compute the projections onto
the eigenvectors 𝑉 𝑘
in 𝐹, for 𝑘 = 1,2, … , 𝑟. Let 𝑦 be the test point, with an image 𝜙(𝑦) in 𝐹.
〈𝑉 𝑘
, 𝜙(𝑦)〉 = ∑ 𝛼𝑖
𝑘〈𝜙(𝑥𝑖), 𝜙(𝑦)〉𝑚
𝑖=1 (9)
〈𝑉 𝑘
, 𝜙(𝑦)〉 nonlinear principal component corresponding to 𝜙.
2.2. Computation of covariance matrix and dot product matrix by positioning on feature space
For the sake of simplicity, we assume that the observations are at the center. This is easy to implement
in the input space because it is not possible to explicitly calculate the average of the observations mapped with
𝐹, but it is more difficult to use 𝐹. Assume that any 𝜙 and any series of observations 𝑋1, 𝑋2, … , 𝑋 𝑚 are given
then let us define 𝜙̅ =
1
𝑚
∑ 𝜙(𝑋𝑖)𝑚
𝑖=1 and then the point 𝜙̃(𝑋𝑖) = 𝜙(𝑋𝑖) − 𝜙̅ will be centered. Therefore,
the above assumption holds, we defin the covariance matrix and the dot product matrix 𝐾̃𝑖𝑗 = 〈𝜙̃(𝑋𝑖), 𝜙̃(𝑋𝑗)〉
in 𝐹. We known eigenvalue problems as 𝑚𝜆̃ 𝛼̃ = 𝐾̃ 𝛼̃ with 𝛼̃ is the expansion coefficient of the eigenvector
relative to the center point 𝜙̃(𝑋𝑖). Since there is no central data, 𝐾̃ cannot be explicitly calculated, but it can be
represented by a corresponding 𝐾 without a center therefore 𝐾̃𝑖𝑗 = 〈𝜙̃(𝑋𝑖) − 𝜙, 𝜙(𝑋𝑗) − 𝜙〉 =𝐾𝑖,𝑗 −
1
𝑚
∑ 𝑘𝑖𝑡 −
1
𝑚
∑ 𝑘 𝑠𝑗 +
1
𝑚2
∑ 𝑘 𝑠𝑡
𝑚
𝑠,𝑡=1
𝑚
𝑠=1
𝑚
𝑡=1 . We can get more compact expression by using the vector
1 𝑚 = (1, … ,1) 𝑇
. The compact expression is 𝐾̃ = 𝐾 −
1
𝑚
𝐾1 𝑚1 𝑚
𝑇
−
1
𝑚
1 𝑚1 𝑚
𝑇
𝐾 +
1
𝑚2
(1 𝑚
𝑇
𝐾1 𝑚)1 𝑚
𝑇
𝐾1 𝑚. We can
calculate 𝐾̃ from 𝐾 and solve the eigenvalue problem. Consider test point 𝑌 projection of the center point of
the center 𝜙-image of 𝑌 to the feature vector of the covariance matrix is computed to find its coordinates [11].
〈𝜙̃(𝑌), 𝑉̃ 𝑘〉 = 〈𝜙(𝑌) − 𝜙⃑ , 𝑉̃ 𝑘〉 = ∑ 𝛼̃ 𝑖
𝑘
〈𝜙(𝑌) − 𝜙⃑ , 𝜙(𝑋𝑖) − 𝜙⃑ 〉𝑚
𝑖=1
= ∑ 𝛼̃ 𝑖
𝑘𝑚
𝑖=1 {𝐾(𝑌, 𝑋𝑖) −
1
𝑚
∑ 𝐾(𝑋𝑠, 𝑋𝑖) −
1
𝑚
∑ 𝐾(𝑌, 𝑋𝑠) +
1
𝑚2
∑ 𝐾(𝑋𝑠, 𝑋𝑡)𝑚
𝑠,𝑡=1
𝑚
𝑠=1
𝑚
𝑠=1 } (10)
Introducing the vector 𝑍.
4. TELKOMNIKA Telecommun Comput El Control
Kernal based speaker specific feature extraction and its applications in… (Satyanand Singh)
2491
𝑍 = (𝐾(𝑌, 𝑋𝑖))
𝑚𝑋1
(11)
(〈𝜙̃(𝑌), 𝑉̃ 𝑘〉)
1𝑋𝑟
= 𝑍 𝑇
𝑉̃ −
1
𝑚
1 𝑚
𝑇
𝐾𝑉̃ −
1
𝑚
(𝑍 𝑇
1 𝑚)1 𝑚
𝑇
𝑉̃ +
1
𝑚2
(1 𝑚
𝑇
𝐾1 𝑚)1 𝑚
𝑇
𝑉̃
= 𝑍 𝑇
(𝐼 𝑚 −
1
𝑚
1 𝑚1 𝑚
𝑇
) 𝑉̃ −
1
𝑚
1 𝑚
𝑇
𝐾 (𝐼 𝑚 −
1
𝑚
1 𝑚1 𝑚
𝑇
) 𝑉̃
= (𝑍 𝑇
−
1
𝑚
1 𝑚
𝑇
𝐾) (𝐼 𝑚 −
1
𝑚
1 𝑚1 𝑚
𝑇
) 𝑉̃ (12)
Note that KPCA implicitly uses only input variables because the algorithm uses kernel function evaluation to
represent the reduction in feature space dimensions. Therefore, KPCA is useful for nonlinear feature extraction
by reducing the size; it does not explain the characteristics of the input variable selection.
3. KERNEL- BASED SPEAKER SPECIFIC FEATURE EXTRACTION AND ITS APPLICATION
IN ASR
Classification algorithms must represent the objects to be classified as points in a multidimensional
feature space. However, one can apply other vector space transformations to the initial features before running
the learning algorithm. There are two reasons for doing this. First, they can improve the performance of
classification and second, they can reduce the data's dimensionality. The selection of initial features and their
transformation are sometimes dealt with in the literature under the title "feature extraction”. To avoid
misunderstanding, this section describes only the latter and describes the first feature set. Hopefully it will be
more effective and classification will be faster. The approach to the extraction of features may be either linear
or nonlinear, but there is a technique that breaks down the barrier between the two forms in some way.
The key idea behind the kernel technique was originally presented in [12] and applied again in connection with
the general purpose SVM [13-15] followed by other kernel-based methods.
3.1. Supplying input variable information into kernel PCA
Additional information to the KPCA representation for interpretability. We have developed a process
to project a given input variable into a subspace spanned by feature vectors 𝑉̃ = ∑ 𝛼̃𝜙̃(𝑋1)𝑚
𝑖=1 . We can think
of our observation as a random vector 𝑋 = (𝑋1, 𝑋2, … . . , 𝑋 𝑛 ) implementation then to represent the prominence
of the input variable 𝑋 𝑘 in the KPCA. Considering a set of points of mathematical forms 𝑦 = 𝑎 + 𝑠𝑒 𝑘 ∈ ℝ 𝑛
where 𝑒 𝑘 = (0, … . ,1, … . ,0) of kth component is either 0 or 1. Next, the projection points 𝜙(𝑦) of these images
onto the subspace spanned by the feature vector 𝑉̃ = ∑ 𝛼̃𝜙̃(𝑋1)𝑚
𝑖=1 can be calculated. Considering in (12)
the row vector gives the induction curve in the Eigen space expressed in matrix form:
𝜎(𝑠)1𝑋𝑟 = (𝑍𝑠
𝑇
−
1
𝑚
1 𝑚
𝑇
𝐾) (𝐼 𝑚 −
1
𝑚
1 𝑚1 𝑚
𝑇
) 𝑉̃ (13)
Furthermore, by projecting the tangent vector to s = 0, we can express the maximum change direction
of 𝜎(𝑠) associated with the variable 𝑋 𝑘. Matrix form of the expression represented as follows:
𝑑𝜎
𝑑𝑠
|
𝑠=0
=
𝑑𝑍 𝑠
𝑇
𝑑𝑠
|
𝑠=0
(𝐼 𝑚 −
1
𝑚
1 𝑚1 𝑚
𝑇
) 𝑉̃ (14)
where
𝑑𝑍 𝑠
𝑇
𝑑𝑠
|
𝑠=0
= (
𝑑𝑍 𝑠
1
𝑑𝑠
|
𝑠=0
, … … . . ,
𝑑𝑍 𝑠
𝑚
𝑑𝑠
|
𝑠=0
)
𝑇
and
𝑑𝑍 𝑠
𝑖
𝑑𝑠
|
𝑠=0
=
𝑑𝐾(𝑌,𝑋 𝑖)
𝑑𝑠
|
𝑠=0
= (∑
𝜕𝐾(𝑌,𝑋 𝑖)
𝜕𝑌𝑡
𝑚
𝑡=1
𝑑𝑌𝑡
𝑑𝑠
)|
𝑠=0
= ∑
𝜕𝐾(𝑌,𝑋 𝑖)
𝜕𝑌𝑡
𝑚
𝑡=1 |
𝑌=𝑎
𝛿𝑡
𝑘
=
𝜕𝐾(𝑌,𝑋 𝑖)
𝜕𝑌 𝑘
|
𝑌=𝑎
where delta of Kronecker is represented as 𝛿𝑡
𝑘
and radial basis kernel as 𝑘(𝑌, 𝑋𝑖) = 𝑒𝑥𝑝(−𝑐‖𝑌 − 𝑋𝑖‖2) =
𝑒𝑥𝑝(−𝑐 ∑ (𝑌𝑖 − 𝑋𝑖𝑡)2𝑛
𝑡=1 ). After considering 𝑦 = 𝑎 + 𝑠𝑒 𝑘 ∈ ℝ 𝑛
:
𝑑𝑍𝑠
𝑖
𝑑𝑠
|
𝑠=0
=
𝜕𝐾(𝑌, 𝑋𝑖)
𝜕𝑌𝑘
|
𝑦=𝑎
= −2𝑐𝐾(𝑎, 𝑋𝑖)(𝑎 𝑘 − 𝑋𝑖𝑘) = −2𝑐𝐾(𝑋𝛽, 𝑋𝑖)(𝑋𝛽𝑘 − 𝑋𝑖𝑘)
5. ISSN: 1693-6930
TELKOMNIKA Telecommun Comput El Control, Vol. 18, No. 5, October 2020: 2488 - 2497
2492
where the training point 𝑎 = 𝑋𝛽. Thus, by applying (13), it is possible to locally represent any given input
variable plot in KPCA. Furthermore, by using (14), it is possible to represent the tangent vector associated with
any given input variable at each sample point [16]. Therefore, a vector field can be drawn on KPCA indicating
the growth direction of a given variable.
There are some existing techniques to compute z for specific kernels [17]. For a Gaussian kernel
(𝑋, 𝑌)) = 𝑒𝑥𝑝(−‖𝑋 − 𝑌‖2
/2𝜎2) , 𝑧 must satisfy the following condition;
𝑍 =
∑ 𝛾 𝑖(‖𝑍−𝑋𝑖‖2/2𝜎2)𝑋𝑖
𝑚
𝑖=1
∑ 𝛾 𝑖
𝑚
𝑖=1 (−‖𝑍−𝑋𝑖‖2)/2𝜎2 (15)
Kernel PCA results for the two- speaker’s audio data with is shown in Figure 1 (b).
(a)
(b)
Figure 1. Standard PCA and Kernel PCA results for the two-speaker’s audio data;
(a) standard PCA and (b) kernel PCA
3.2. Application of kernel independent component analysis (KICA)
Independent component analysis is a general statistical approach originally born from the study of
separation from blind sources. Another application of ICA is the unsupervised extraction of features. This is
intended to transform input data linearly into uncorrelated elements, using at least a distribution of the Gaussian
sample set [18]. The explanation for this is that classification of data in certain directions would be simpler.
This is in accordance with the most popular speech modeling technique, i.e. fitting Gaussian mixtures on each
6. TELKOMNIKA Telecommun Comput El Control
Kernal based speaker specific feature extraction and its applications in… (Satyanand Singh)
2493
class. This obviously means that Gaussian mixtures can approximate the distributions of the groups KICA
extends this by assuming, on the contrary, that when all classes are fused, the distribution is not Gaussian; thus,
using non-Gaussianism as a heuristic for the uncontrolled extraction of features would prefer those directions
which separate classes.
Several objective functions for optimal selection of independent directions were described using
approximately equivalent approaches. The KICA algorithm's goal itself is to find such objective functions as
optimally as possible [19]. For KICA output most iterative methods are available. Others need to be
preprocessed, i.e. focused and whitened while others do not. Overall, experience shows that all of these
algorithms can converge faster with oriented and whitewashed data, even those that don't really need it [20].
Let's first investigate how the centering and whitening pre-processing steps can be done in the kernel
function space. To this end, allow the kernel function 𝜘 in ℱ to implicitly define the inner product with
the associated transformation 𝜙. Step one Centering ℱ - Shifting the data 𝜙(𝑋1), 𝜙(𝑋2), … . . , 𝜙(𝑋 𝑘) along with
its mean 𝐸{𝜙(𝑋)} to get the data as follows:
{
𝜙′(𝑋1) = 𝜙(𝑋1) − 𝐸{𝜙(𝑋)}
𝜙′(𝑋2) = 𝜙(𝑋2) − 𝐸{𝜙(𝑋)}
.
.
𝜙′(𝑋 𝑘) = 𝜙(𝑋 𝑘) − 𝐸{𝜙(𝑋)}
(16)
Step two Whitening in ℱ. Transfoprming the centered samples 𝜙′(𝑋1), 𝜙′(𝑋2), … . , 𝜙′(𝑋 𝑘) via an orthogonal
transformation 𝑄 into its vectors 𝜙̂(𝑋1) = 𝑄𝜙′(𝑋1), 𝑄𝜙′(𝑋2), … , 𝑄𝜙′(𝑋 𝑘) = 𝑄𝜙(𝑋 𝑘
′ ) . 𝐶̂ = is the covariance
matrix. Because standard PCA converts the covariance matrix into a diagonal form just like its kernel based
equivalent, where the diagonal elements are the unique values of the data covariance matrix 𝐸{𝜙̂(𝑋)𝜙̂(𝑋) 𝑇
},
all that remains is to transform the diagonal element into 1. Based on this finding, a slight modification of
the formulas provided in the KPCA section will obtain the necessary whitening transformation [21]. Here
(𝛼1 𝜆1), (𝛼2 𝜆2), … . . , (𝛼 𝑘 𝜆 𝑘) and 𝜆1 ≥ 𝜆2 ≥ 𝜆 𝑘 are the eighpairs of 𝐸{𝜙̂(𝑋)𝜙̂(𝑋) 𝑇
} then the transformation
matrix 𝑄 will take a form[𝜆1
−
1
2
𝛼1, 𝜆2
−
1
2
𝛼2, … . . 𝜆 𝑚
−
1
2
𝛼 𝑚]
𝑇
. Kernel Independent component analysis results for
the two-speaker’s audio data is shown in Figure 2 (a).
3.3. Application of kernel linear discriminant analysis (KLDA)
LDA is a conventional, supervised method of extracting speaker-specific characteristics [19] that has
proven to be one of the most effective pre-processing classification techniques. It has also long been used
in speech recognition [22]. The main goal of LDA is to find a new orthogonal data set to provide the optimal
class separation.
In KLDA we are essentially following the discussion of its linear counterpart, except in this case this
is intended to happen implicitly in the kernel feature space F. Let's say again that a kernel function with
a feature map and a kernel field space has been chosen. In order to define the transformation matrix 𝐴 of KLDA,
we define the objective function first as Γ ∶ ℱ → ℛ, because of the supervised nature of this method, it depends
not only on the test data 𝑋 but also on the indicator ℒ. Let's describe ubiquitous Γ(V).
Γ(V) =
𝑉 𝑇ℬ𝑉
𝑉 𝑇 𝒲𝑉
, 𝑉 ∈: ℱ{0} (17)
where ℬ is the scatter matrix of the interclass, while 𝒲 is the scatter matrix of the interclass. Here, the scatter
matrix ℬ between classes shows the scatter of the mean vectors 𝜇 𝑗 around the overall mean vector 𝜇.
ℬ = ∑
𝑘 𝑗
𝑘
(𝜇 𝑗 − 𝜇)𝑟
𝑗=1 (𝜇 𝑗 − 𝜇)
𝑇
; 𝜇 =
1
𝑘
∑ 𝜙(𝑥𝑖)𝑘
𝑖=𝑘 ; 𝜇 𝑗 =
1
𝑘 𝑗
∑ 𝜙(𝑥𝑖)ℒ(𝑖) (18)
with the class label 𝐽, the in-class scatter matrix 𝒲 represents the weighted average scatter of the sample vector
covariance matrices 𝐶𝑗.
𝒲 = ∑
𝑘 𝑗
𝑘
𝐶𝑗 ;𝑟
𝑗=1 𝐶𝑗 =
1
𝑘 𝑗
∑ (𝜙(𝑥𝑖) − 𝜇 𝑗) (𝜙(𝑥𝑖 − 𝜇 𝑗))
𝑇
ℒ(𝑖)=𝑗 (19)
Kernel linear discriminant analysis results for the two-speaker’s audio data is shown in Figure 2 (b).
7. ISSN: 1693-6930
TELKOMNIKA Telecommun Comput El Control, Vol. 18, No. 5, October 2020: 2488 - 2497
2494
(a)
(b)
Figure 2. Kernel independent component analysis and kernel linear discriminant
analysis results for the two-speaker’s audio data; (a) kernel independent component analysis
and (b) kernel lnear discriminant analysis
4. EXPERIMENTAL SETUP
To evaluate the efficiency of kernel-based speaker-specific feature extraction techniques, an isolated
word recognition experiment was performed. The experiment includes 520 Japanese words from the ATR
Japanese C language set Voice database, 80 speakers (40 men and 40 Female). Audio samples of 10 iTaukei
speakers were collected at random and under unfavourable conditions. The average duration of the training
samples was six seconds per speaker for all 10 speakers and out of twenty utterances of each speaker just one
was used for training purpose [23-27]. For matching purposes the remaining 19 voice samples were used from
the corpus. We have recorded utterances for this investigation were at one sitting for each speaker. The text for
the utterances was randomly selected by speaker. The main voice recordings consist of both male and female
speakers of twenty utterance of each using sampling rate of 16 kHz with 16 bits/sample.
Throughout the experiment, 10400 utterances were used as training data and the remaining 31,200
utterances were used as test data. The sampling rate of the audio signal is 10 kHz. 12 Mel-Cepstral coefficients
extracted using 25.6 ms Hamming windows with 10 ms shifts [28-32]. The features of KPCA were extracted
from 13 Mel-cepstral coefficients including zero coefficients corresponding to 39 vector coefficients and their
increment and acceleration coefficients. Around 1,000,000 frames were used as training data in this
experiment, and it is computationally impossible to calculate matrix K with this amount of data. N frames are
randomly picked from the training data to reduce the number of frames. The number N= 1024 was chosen to
make the system computationally feasible.
Table 1 represents efficiency and EER of the ASR system for KLDA, KICA and KLDA respectively
for ATR Japanese C language. Table 2 represents efficiency and EER of the ASR system for KLDA, KICA
and KLDA respectively for10 iTaukei speakers cross language. Figure 3 show the equal error rate (EER) of
8. TELKOMNIKA Telecommun Comput El Control
Kernal based speaker specific feature extraction and its applications in… (Satyanand Singh)
2495
KLDA KICA, and KPCA based modeling technique. The ASR efficiency of of KLDA KICA, and KPCA based
modeling technique are 99.9%, 99.6%, and 98.1% and EER are 4.7%, 4.9% and 5.1% respectively for 6 sec of
audio signal. The EER improvement of KLDA technique based ASR system compared with KICA and KPCA
is 4.25% and 8.51% respectively.
(a)
(b)
Figure 3. EER of KLDA, KICA and KLDA technique for 6 sec of voice data;
(a) ATR Japanese C language and (b) iTaukei speaker’s cross language
9. ISSN: 1693-6930
TELKOMNIKA Telecommun Comput El Control, Vol. 18, No. 5, October 2020: 2488 - 2497
2496
Table 1. Efficiency and EER of the ASR system for KLDA, KICA and KLDA respectively
for ATR Japanese C language
KLDA KICA KPCA
Efficiency in % EER in % Efficiency in % EER in % Efficiency in % EER in %
6 sec 99.7 1.95 99.6 2.31 99.1 3.41
4 sec 99.5 2.29 99.1 3.20 98.2 4.11
2 sec 98.8 3.23 98.3 4.32 97.6 5.3
Table 2. Efficiency and EER of the ASR system for KLDA, KICA and KLDA respectively
for10 iTaukei speakers cross language
KLDA KICA KPCA
Efficiency in % EER in % Efficiency in % EER in % Efficiency in % EER in %
6 sec 94.9 2.04 94.6 3.4 94.1 4.1
4 sec 94.3 2.34 94.1 3.7 93.5 4.8
2 sec 93.5 3.2.0 93.1 4.1 92.6 5.3
5. CONCLUSION
An experimental evaluation of the performance of the ASR system has been done on 6 sec of voice
data of ATR Japanese C language. For the 10400, voice samples of the ATR Japanese C language speaker
recognition accuracy 99.7%, 99.6%, and 99.1% and equal error rate (EER) is 1.95%, 2.31%, and 3.41%
respectively for KLDA, KICA, and KPCA. The EER improvement of the KLDA technique-based ASR system
compared with KICA and KPCA is 4.25% and 8.51% respectively. We find that non-linear transformations
usually lead to better classification than non-linear transformations, and are therefore a promising new research
direction. We also found that the supervised transformations are usually stronger than those not supervised.
We think it would be worth searching for other supervised approaches which could be built similarly to
the KLDA or KICA-based ASR application methodology. Such transformations significantly improved
the phonological knowledge ASR training framework by providing a comprehensive and accurate
classification of speaking contextual features unique to real-time speakers.
REFERENCES
[1] S. Singh, “Support Vector Machine Based Approaches For Real Time Automatic Speaker Recognition System,”
International Journal of Applied Engineering Research, pp. 8561-8567, 2018.
[2] B. Schölkopf, A. J. Smola, and K. R. Muller et al., “Kernel Principal Component Analysis,” Advances in Kernel
Methods: Support Vector Learning, pp. 327–352, 1999.
[3] F. R. Bach and M. I. Jordan, “Kernel independent component analysis,” Journal of Machine Learning Research,
vol. 3, no. 2002, pp. 1-48, 2002.
[4] G. Baudat and F. Anouar, “Generalized discriminant analysis using a kernel approach,” Neural Comput., vol. 12,
no. 10, pp. 2385–2404, 2000.
[5] A. Kocsor and K. Kovács et al., “Kernel springy discriminant anlysis and its application to a phonological awareness
teaching system,” International Conference on Text, Speech and Dialogue, vol. 2448, pp. 325-328, 2002.
[6] F. R. Bach and M. I. Jordan, “Kernel independent component analysis,” J. Machine Learning Res., vol. 3,
pp. 1–48, 2002.
[7] A. Kocsor and J. Csirik, “Fast independent component analysis in kernel feature spaces,” SOFSEM 2001: Theory
and Practice of Informatics, 28th Conference on Current Trends in Theory and Practice of Informatics Piestany,
vol. 2234, pp. 271–281, 2001.
[8] A. Kocsor and K. Kovács et al., “Kernel springy discriminant anlysis and its application to a phonological awareness
teaching system,” International Conference on Text, Speech and Dialogue, vol. 2448, pp. 325–328, 2002.
[9] D. Lay, “Linear Algebra and its applications,” 4th ed., Pearson, 2012
[10] B. Scholkopf, A. J. Smola, K.R. Muller, “Nonlinear Component Analysis as a Kernel Eigenvalue Problem,” Neural
Computation, vol. 10, pp. 1299-1319, 1998.
[11] Weinberger, Kilian Q., Packer, Benjamin D., and Saul, Lawrence K. “Nonlinear dimensionality reduction by
semidefinite programming and kernel matrix factorization,” Proceedings of the Tenth International Workshop on
Artificial Intelligence and Statistics, pp. 381-388, 2005.
[12] Gerazov, B.; Ivanovski, Z. “Kernel Power Flow Orientation Coefficients for Noise-Robust Speech Recognition,”
IEEE/ACM Trans. Audio Speech Lang. Process, vol. 23, pp.407-419, 2015.
[13] J. Geiger, B. Schuller, G. Rigoll, “Large-scale audio feature extraction and SVM for acoustic scene classification,”
Proceedings of the 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA),
New Paltz, NY, USA, 20–23 October, pp. 1-4, 2013.
10. TELKOMNIKA Telecommun Comput El Control
Kernal based speaker specific feature extraction and its applications in… (Satyanand Singh)
2497
[14] A. Rabaoui, M. Davy, S. Rossignaol, N. Ellouze, “Using One-Class SVMs and Wavelets for Audio Surveillance,”
IEEE Trans. Inf. Forensics Secur, vol. 3, 763-775, 2018.
[15] H. Jiang, J. Bai, S. Zhang, B. Xu, “SVM-based audio scene classification,” Proceedings of the 2005 IEEE
International Conference on Natural Language Processing and Knowledge Engineering, Wuhan, China,
pp. 131-136, 2005.
[16] S. Mika, B. Scholkopf, A. J. Smola, K. R. Muller, M. Scholz, and G. Ratsch, “Kernel PCA and de–noising in feature
spaces,” in M. S. Kearns, S. A. Solla, and D. A. Cohn, editors, Advances in Neural Information Processing Systems,
volume 11, pages 536–542. MIT Press, 1999.
[17] K. I. Kim, K. Jung, H. J. Kim, “Face Recognition Using Kernel Principal Component Analysis,” IEEE Signal
Processing Letters, vol. 9, no. 2, pp. 40-42, February 2002).
[18] L. B. Almeida, “MISEP - linear and nonlinear ICA based on mutual information,” Journal of Machine Learning
Research, vol. 4, no. 2002, pp.1297-1318, 2003.
[19] K. Zhang, L. Chan, “Nonlinear independent component analysis with minimum nonlinear distortion,” ICML 2007,
Corvallis, OR, US, pp. 1127-1134, 2007.
[20] M. S. Bartlett, J. R. Movellan, T. J. Sejnowski, “Face Recognition by Independent Component Analysis,” IEEE
Transactions on Neural Networks, vol. 13, no. 6, pp. 1450-1464, 2002.
[21] F. R. Bach, M. I. Jordan, “Kernel Independent Component Analysis,” J. Machine Learning Res, vol. 3, no. 2002,
pp.1-48, 2002.
[22] O. Siohan, “On the robustness of linear discriminant analysis as a preprocessing step for noisy speech recognition,”
Proc. ICASSP, Detroit, MI, pp. 125-128, 1995.
[23] S. Singh, EG Rajan,” Application of different filters in Mel frequency cepstral coefficients feature extraction and
fuzzy vector quantization approach in speaker recognition,” International Journal of Engineering Research &
Technology, vol. 2, no. 6, pp. 419-425, 2013.
[24] G. S. Morrison, and F. Kelly, “A statistical procedure to adjust for time-interval mismatch in forensic voice
comparison” Speech Communication, vol. 112, pp. 15- 21, 2019.
[25] S. Singh and Ajeet Singh “Accuracy Comparison using Different Modeling Techniques under Limited Speech Data
of Speaker Recognition Systems,” Global Journal of Science Frontier Research: F Mathematics and Decision
Sciences, vol. 16, no. 2, pp. 1-17, 2016.
[26] P. Boersma and D. Weenink, “Praat: doing phonetics by computer”, version 6.0.37, 2020. Available:
http://www.praat.org/, 2020
[27] S. Singh. “The Role of Speech Technology in Biometrics, Forensics and Man-Machine Interface” International
Journal of Electrical and Computer Engineering, vol. 9, no. 1, pp. 281-288, 2019.
[28] Chen-Chou Lo, Szu-Wei Fu, Wen-Chin Huang, Xin Wang, Junichi Yamagishi, Yu Tsao, and Hsin-Min Wang,
“MOSNet: Deep Learning based Objective Assessment for Voice Conversion,” Interspeech ISCA,
pp. 1542–1545, 2019.
[29] D. Snyder, D. Garcia-Romero, G. Sell, D. Povey, and S. Khudanpur, “X-vectors: Robust DNN embeddings for
speaker recognition,” ICASSP 2018, 2018.
[30] Galina Lavrentyeva, Sergey Novoselov, Andzhukaev Tseren, Marina Volkova, Artem Gorlanov, and Alexandr
Kozlov, “STC Antispoofing Systems for the ASVspoof2019 Challenge,” Proc. Interspeech 2019,
pp. 1033-1037, 2019.
[31] S. Singh and Mansour H. Assaf, “A Perfect Balance of Sparsity and Acoustic hole in Speech Signal and Its
Application in Speaker Recognition System,” Middle-East Journal of Scientific Research, vol. 24, no.11,
pp. 3527-3541, 2016.
[32] A. Alexander, O. Forth, A. A. Atreya, F. Kelly, “VOCALISE: A Forensic Automatic Speaker Recognition System
Supporting Spectral, Phonetic, and User-Provided Features,” Research and Development Oxford Wave Research Ltd,
United Kingdom, 2016.