This document analyzes a modified local binary pattern (MLBP) method for extracting features from speech signals. The MLBP method represents speech signals with a small set of values to increase processing speed for applications like security systems. The document describes the MLBP method and presents results of experiments applying it to different speech signals. The experiments show that the MLBP method extracts unique features for each speech signal, is stable across runs, and is efficient, providing an average extraction time of 0.006 seconds with significant speed increases over other common feature extraction methods.
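The abstract does not specify how the modified LBP differs from the classic operator, so the following is a minimal, hypothetical sketch of plain one-dimensional LBP feature extraction on a speech frame (the function name, radius, and histogram size are illustrative assumptions, not the paper's method):

```python
import numpy as np

def lbp_1d(signal, radius=4):
    """Classic 1-D local binary pattern: each sample is compared with its
    `radius` neighbours on each side; the comparison bits form a code,
    and the histogram of codes is the compact feature vector."""
    n = len(signal)
    codes = []
    for i in range(radius, n - radius):
        center = signal[i]
        bits = 0
        # neighbours left-to-right, skipping the centre sample itself
        neighbours = list(signal[i - radius:i]) + list(signal[i + 1:i + 1 + radius])
        for b, v in enumerate(neighbours):
            if v >= center:
                bits |= 1 << b
        codes.append(bits)
    n_bins = 2 ** (2 * radius)
    hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins))
    return hist / max(hist.sum(), 1)   # normalised histogram

# toy "speech" frame
rng = np.random.default_rng(0)
frame = rng.standard_normal(1000)
features = lbp_1d(frame)
```

The appeal of this family of descriptors for speed is that the whole frame collapses to one small fixed-length histogram, regardless of signal length.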
Comparison of Interpolation Methods in Prediction the Pattern of Basal Stem R... (Waqas Tariq)
Basal Stem Rot (BSR), caused by Ganoderma boninense, is the most serious disease of oil palm trees in Malaysia. With advances in computer technology, the analysis of plant disease has been carried out extensively, although spatial and temporal data in particular remain complicated to process. The application of GIS to plant disease analysis is also becoming more popular, precise, and advanced. In previous studies, Kriging has been used to predict the pattern of BSR disease. In this study, two interpolation methods commonly used in GIS, Kriging and Inverse Distance Weighting (IDW), are used to interpolate and predict the pattern of BSR disease. Since IDW is an exact interpolator, it was expected to give more accurate results; however, the accuracy of both methods turned out to be the same. Based on the characteristics, advantages, and disadvantages of both methods, IDW is recommended in this study, but for more informative output, Ordinary Kriging is suggested as the preferable alternative.
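As a rough illustration of the IDW side of the comparison, here is a minimal sketch of Inverse Distance Weighting (the power-2 weighting and toy disease-severity values are assumptions; the study's actual GIS configuration is not given):

```python
import numpy as np

def idw_interpolate(xy_known, z_known, xy_query, power=2, eps=1e-12):
    """Inverse Distance Weighting: each query point is a weighted average of
    the known samples, with weights ~ 1 / distance**power. IDW is an exact
    interpolator: at a sample point it returns the sample value itself."""
    xy_known = np.asarray(xy_known, float)
    z_known = np.asarray(z_known, float)
    out = []
    for q in np.asarray(xy_query, float):
        d = np.linalg.norm(xy_known - q, axis=1)
        if d.min() < eps:                     # query lies exactly on a sample
            out.append(float(z_known[int(d.argmin())]))
            continue
        w = 1.0 / d ** power
        out.append(float(np.dot(w, z_known) / w.sum()))
    return np.array(out)

# hypothetical disease-severity samples at four trees, query at the centre
pts = [(0, 0), (0, 1), (1, 0), (1, 1)]
vals = [0.0, 1.0, 1.0, 2.0]
centre = idw_interpolate(pts, vals, [(0.5, 0.5)])
```

The "exact method" property mentioned in the abstract is visible in the early-return branch: unlike Kriging, IDW never smooths away the observed values at the sampled trees.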
On the use of voice activity detection in speech emotion recognition (journalBEEI)
Emotion recognition through speech has many potential applications; however, the challenge lies in achieving high recognition accuracy with limited resources or in the presence of interference such as noise. In this paper we explore the possibility of improving speech emotion recognition by utilizing the voice activity detection (VAD) concept. The emotional voice data from the Berlin Emotion Database (EMO-DB) and a custom-made database, the LQ Audio Dataset, are first preprocessed by VAD before feature extraction. The features are then passed to a deep neural network for classification. MFCC was chosen as the sole determinant feature. Comparing results obtained with and without VAD, we found that VAD improved the recognition rate of 5 emotions (happy, angry, sad, fear, and neutral) by 3.7% on clean signals, while using VAD when training the network with both clean and noisy signals improved our previous results by 50%.
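The paper does not describe which VAD algorithm was used; a minimal energy-threshold sketch of the VAD idea (frame length and threshold ratio are illustrative assumptions) might look like:

```python
import numpy as np

def energy_vad(signal, frame_len=256, threshold_ratio=0.1):
    """Mark a frame as speech when its short-time energy exceeds a fraction
    of the loudest frame's energy; returns one boolean per frame."""
    n_frames = len(signal) // frame_len
    frames = np.reshape(signal[:n_frames * frame_len], (n_frames, frame_len))
    energy = (frames ** 2).sum(axis=1)
    return energy > threshold_ratio * energy.max()

# synthetic signal: quiet first half, loud second half
sr = 8000
t = np.arange(sr) / sr
sig = np.where(t < 0.5, 0.01, 1.0) * np.sin(2 * np.pi * 200 * t)
mask = energy_vad(sig)
```

Dropping the frames the mask rejects before feature extraction is what removes silence and low-energy noise from the training material, which is the mechanism the abstract credits for the improvement.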
Comparative study to realize an automatic speaker recognition system (IJECEIAES)
In this research, we present an automatic speaker recognition system based on adaptive orthogonal transformations. To obtain informative features with minimum dimensionality from the input signals, we created an adaptive operator, which helped to identify the speaker's voice in a fast and efficient manner. We tested the efficiency and performance of our method by comparing it with mel-frequency cepstral coefficients (MFCCs), which are widely used by researchers for feature extraction. The experimental results show the importance of the adaptive operator, which gives added value to the proposed approach. The system achieved 96.8% accuracy using the Fourier transform as a compression method and 98.1% using correlation as a compression method.
Speech signal compression and encryption based on sudoku, fuzzy C-means and t... (IJECEIAES)
Compression and encryption of speech signals are essential multimedia technologies. In the field of speech, they are needed to meet security and confidentiality requirements when transferring huge speech signals over a network, and to decrease storage space for rapid retrieval. In this paper, we propose an algorithm that includes a hybrid transformation to analyze the speech signal's frequencies. The speech signal is then compressed, after removing low and less intense frequencies, to produce a well-compressed signal while preserving speech quality. The compressed speech is used as input to a scrambling algorithm proposed on two levels. The external scramble mixes up the speech segments, which are partitioned using Fuzzy C-Means, and changes their locations. The internal scramble scatters the values of each block based on the pattern of a Sudoku puzzle and a quadratic map, and the result is fed to a proposed encryption algorithm using the Threefish cipher. The proposed algorithm proved highly efficient in compressing and encrypting the speech signal according to accepted statistical measures.
Intelligent Arabic letters speech recognition system based on mel frequency c... (IJECEIAES)
Speech recognition is one of the important applications of artificial intelligence (AI). It aims to recognize spoken words regardless of the speaker. The process involves extracting meaningful features from spoken words and then classifying those features into their classes. This paper presents a neural network classification system for Arabic letters and studies the effect of changing the multi-layer perceptron (MLP) artificial neural network (ANN) properties to obtain optimized performance. The proposed system consists of two main stages: first, the recorded spoken letters are transformed from the time domain into the frequency domain using the fast Fourier transform (FFT), and features are extracted using mel frequency cepstral coefficients (MFCC); second, the extracted features are classified using an MLP ANN with the back-propagation (BP) learning algorithm. The results show that the proposed system, with the extracted features, can classify Arabic spoken letters using two hidden layers with an accuracy of around 86%.
Modeling Text Independent Speaker Identification with Vector Quantization (TELKOMNIKA JOURNAL)
Speaker identification is one of the most important technologies nowadays. Many fields, such as bioinformatics and security, use speaker identification, and almost all electronic devices use this technology too. Based on the text used, speaker identification is divided into text-dependent and text-independent. In many fields, text-independent identification is mostly used because the number of possible texts is unlimited, which makes it generally more challenging than text-dependent identification. In this research, text-independent speaker identification on Indonesian speaker data was modelled with Vector Quantization (VQ). VQ with K-Means initialization was used: K-Means clustering initialized the means, and Hierarchical Agglomerative Clustering was used to identify the K value for VQ. The best VQ accuracy was 59.67% when k was 5. According to the results, the Indonesian language can be modelled by VQ. This research can be developed further by optimizing the VQ parameters with methods such as Genetic Algorithms or Particle Swarm Optimization.
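A compact sketch of the VQ-with-K-Means idea described above, on synthetic 2-D "feature" clouds (a real system would use MFCC vectors per speaker; the data, dimensions, and k value here are illustrative only):

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain K-Means, used both to initialise and to train a VQ codebook."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids

def vq_distortion(X, codebook):
    """Average distance of feature vectors to their nearest codeword."""
    d = ((X[:, None] - codebook[None]) ** 2).sum(-1)
    return float(np.sqrt(d.min(axis=1)).mean())

# two hypothetical speakers as well-separated 2-D feature clouds
rng = np.random.default_rng(1)
spk_a = rng.normal(0.0, 0.3, (100, 2))
spk_b = rng.normal(5.0, 0.3, (100, 2))
cb_a = kmeans(spk_a, k=5)   # one codebook per enrolled speaker
cb_b = kmeans(spk_b, k=5)

# identify an unseen utterance from speaker A: lowest distortion wins
test_utt = rng.normal(0.0, 0.3, (50, 2))
predicted = "A" if vq_distortion(test_utt, cb_a) < vq_distortion(test_utt, cb_b) else "B"
```

At test time the system never compares utterances directly: each enrolled speaker is summarised by a codebook, and the claimed identity is the codebook with the lowest quantization distortion.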
Virtual private networks (VPNs) provide remote, secure connections for clients to exchange information with company networks. This paper deals with a site-to-site IPsec VPN that connects company intranets. The IPsec VPN network is implemented with security protocols for key management and exchange, authentication, and integrity using the GNS3 network simulator. Testing and verification of the data packets is done using both the PING tool and Wireshark to ensure the encryption of data packets during exchanges between different sites belonging to the same company.
The performance of an algorithm can be improved using a parallel programming approach. In this study, the bubble sort algorithm was run on computers with various specifications. Experimental results show that parallel programming can reduce execution time by 61%-65% compared to serial programming.
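Bubble sort is usually parallelized as odd-even transposition sort: within one phase every compare-swap touches a disjoint pair of elements, so all swaps of a phase can run concurrently. A sketch with the phases simulated sequentially (the parallel scheduling itself is omitted):

```python
def odd_even_transposition_sort(a):
    """Bubble sort restructured as odd-even transposition: an even phase
    compares pairs (0,1),(2,3),..., an odd phase compares (1,2),(3,4),...
    Because the pairs within a phase are disjoint, each phase is trivially
    parallelizable; n phases guarantee a sorted result."""
    a = list(a)
    n = len(a)
    for phase in range(n):
        start = phase % 2
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a

print(odd_even_transposition_sort([5, 1, 4, 2, 8, 0]))  # → [0, 1, 2, 4, 5, 8]
```

Plain bubble sort cannot be parallelized directly because consecutive swaps overlap; splitting them into disjoint odd/even pairs is what exposes the concurrency the study exploits.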
Parameters Optimization for Improving ASR Performance in Adverse Real World N... (Waqas Tariq)
Existing research shows that many techniques and methodologies are available for every step of an Automatic Speech Recognition (ASR) system, but performance (minimization of Word Error Rate, WER, and maximization of Word Accuracy Rate, WAR) does not depend only on the technique applied. It mainly depends on the category of noise, the noise level, and the sizes of the window, frame, and frame overlap considered in existing methods. The main aim of the work presented in this paper is to vary parameters such as window size, frame size, and frame overlap percentage to observe algorithm performance for various categories and levels of noise, and to train the system across all parameter sizes and categories of real-world noisy environments in order to improve the performance of the speech recognition system. The paper presents Signal-to-Noise Ratio (SNR) and accuracy results obtained by varying these parameter sizes. Since it is very hard to evaluate the test results and choose a parameter size that optimizes ASR performance, the study further suggests feasible, optimum parameter sizes using a Fuzzy Inference System (FIS) for enhancing accuracy in adverse real-world noisy environmental conditions. This work will help give discriminative training to ubiquitous ASR systems for better Human Computer Interaction (HCI).
Keywords: ASR Performance, ASR Parameters Optimization, Multi-Environmental Training, Fuzzy Inference System for ASR, ubiquitous ASR system, Human Computer Interaction (HCI)
Isolated Word Recognition System For Tamil Spoken Language Using Back Propaga... (CSEIJJournal)
Speech recognition has been an active research topic for more than 50 years. Interacting with the computer through speech is an active scientific research field, particularly for the disabled community, who face a variety of difficulties in using the computer. Research in Automatic Speech Recognition (ASR) is investigated for different languages because each language has its specific features, and the need for an ASR system for the Tamil language has increased widely in the last few years. In this paper, a speech recognition system for individually spoken words in the Tamil language using a multilayer feed-forward network is presented. To implement the system, the input signal is first preprocessed using four types of filters, namely pre-emphasis, median, average, and Butterworth band-stop filters, to remove background noise and enhance the signal. The performance of these filters is measured using MSE and PSNR values, and the best filtered signal is taken as the input for the further stages of the ASR system.
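As a sketch of the first two preprocessing filters mentioned above (the pre-emphasis coefficient and median window width are common defaults, not necessarily the paper's settings):

```python
import numpy as np

def preemphasis(signal, alpha=0.97):
    """First-order high-pass filter: y[n] = x[n] - alpha * x[n-1].
    Boosts the high frequencies that carry most consonant information."""
    signal = np.asarray(signal, float)
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

def median_filter(signal, width=3):
    """Sliding median over `width` samples; removes impulsive (click) noise
    that an averaging filter would only smear out."""
    pad = width // 2
    padded = np.pad(np.asarray(signal, float), pad, mode="edge")
    return np.array([np.median(padded[i:i + width]) for i in range(len(signal))])

x = np.array([0.0, 1.0, 0.0, 9.0, 0.0, 1.0])   # the 9.0 is an impulsive click
y = preemphasis(x)
z = median_filter(x)   # the click is replaced by the local median
```

Comparing the MSE/PSNR of each filter's output against a reference, as the paper does, is then a matter of running the candidates on the same noisy input and scoring the results.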
Comparison of signal smoothing techniques for use in embedded system for moni... (Dalton Valadares)
This paper compares signal smoothing techniques for use in an embedded system responsible for monitoring biofuel quality, specifically oxidative stability.
Machine Translation is an emerging field of Computer Science. Research has been done to build Machine Translation systems for different language pairs using different approaches, including rule-based machine translation and Statistical Machine Translation (SMT). The goal of this project is to design a statistical machine translator for software localization using the Moses decoder. The system is expected to automatically localize (translate) software contents from English into Tamil using Statistical Machine Translation.
Comparative Study of Different Techniques in Speaker Recognition: Review (IJAEMSJORNAL)
Speech is the most basic and essential method of communication used by people. A speaker is recognized on the basis of the individual information included in speech signals. Speaker recognition (SR) is useful for identifying the person who is speaking, and in recent years it has been used in security systems. In this paper we discuss feature extraction techniques such as Mel frequency cepstral coefficients (MFCC), linear predictive coding (LPC), and dynamic time warping (DTW), and, for classification, Gaussian Mixture Models (GMM), artificial neural networks (ANN), and support vector machines (SVM).
Speaker Recognition System using MFCC and Vector Quantization Approach (ijsrd.com)
This paper presents an approach to speaker recognition that uses frequency spectral information on the Mel scale to improve speech feature representation in a Vector Quantization (VQ) codebook-based recognition approach. The Mel frequency approach extracts features of the speech signal to obtain the training and testing vectors. The VQ codebook approach uses the training vectors to form clusters and recognizes speakers accurately with the help of the LBG algorithm.
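The LBG (Linde-Buzo-Gray) algorithm mentioned above grows a codebook by repeated splitting; a minimal sketch (the perturbation factor and the synthetic stand-in data are illustrative assumptions):

```python
import numpy as np

def lbg_codebook(X, size, eps=0.01, iters=20):
    """Linde-Buzo-Gray: start from the global mean, repeatedly split every
    codeword by a small +/- perturbation (doubling the codebook), then
    refine with K-Means-style nearest-neighbour updates."""
    X = np.asarray(X, float)
    codebook = X.mean(axis=0, keepdims=True)
    while len(codebook) < size:
        codebook = np.vstack([codebook * (1 + eps), codebook * (1 - eps)])
        for _ in range(iters):
            labels = np.argmin(((X[:, None] - codebook[None]) ** 2).sum(-1), axis=1)
            for j in range(len(codebook)):
                if (labels == j).any():
                    codebook[j] = X[labels == j].mean(axis=0)
    return codebook

rng = np.random.default_rng(0)
feats = rng.normal(0, 1, (200, 2)) + 3.0   # stand-in for a speaker's MFCC vectors
cb = lbg_codebook(feats, size=4)
```

Because the codebook doubles at every split, LBG naturally produces codebooks of size 2, 4, 8, ..., which is why VQ codebook sizes in the literature are usually powers of two.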
Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition... (TELKOMNIKA JOURNAL)
Speech recognition can be defined as the process of converting voice signals into a sequence of words by applying a specific algorithm implemented in a computer program. Research on speech recognition in Indonesia is relatively limited. This paper studies which feature extraction method, Linear Predictive Coding (LPC) or Mel Frequency Cepstral Coefficients (MFCC), is best for speech recognition in the Indonesian language. This matters because a method that produces high accuracy for one language does not necessarily produce the same accuracy for another, since every language has different characteristics; this research therefore hopes to help accelerate the adoption of automatic speech recognition for Indonesian. There are two main processes in speech recognition: feature extraction and recognition. The feature extraction methods compared in this study are LPC and MFCC, while recognition uses a Hidden Markov Model (HMM). The test results show that MFCC is better than LPC for Indonesian-language speech recognition.
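For reference, a minimal single-frame MFCC computation of the kind the compared front-ends rely on (the filter count, frame length, and coefficient count are common defaults, not necessarily those of the study):

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc_frame(frame, sr, n_filters=26, n_ceps=13):
    """MFCC for one frame: windowed power spectrum -> triangular mel
    filterbank -> log energies -> type-II DCT, keeping n_ceps coefficients."""
    n_fft = len(frame)
    spec = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2
    # triangular filters equally spaced on the mel scale
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, len(spec)))
    for j in range(1, n_filters + 1):
        l, c, r = bins[j - 1], bins[j], bins[j + 1]
        for k in range(l, c):
            fbank[j - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fbank[j - 1, k] = (r - k) / max(r - c, 1)
    log_energy = np.log(fbank @ spec + 1e-10)
    # type-II DCT of the log filterbank energies
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2 * n_filters)))
    return dct @ log_energy

sr = 16000
t = np.arange(512) / sr
coeffs = mfcc_frame(np.sin(2 * np.pi * 440 * t), sr)
```

The mel-spaced filterbank is the step LPC lacks: it compresses the spectrum the way human hearing does, which is one common explanation for MFCC's edge in comparisons like this one.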
SYNTHETICAL ENLARGEMENT OF MFCC BASED TRAINING SETS FOR EMOTION RECOGNITION (cscpconf)
Emotional state recognition through speech is a very interesting research topic nowadays. Using subliminal information in speech, it is possible to recognize the emotional state of a person. One of the main problems in the design of automatic emotion recognition systems is the small number of available patterns, which makes the learning process more difficult due to the generalization problems that arise under these conditions.
In this work we propose a solution to this problem consisting of enlarging the training set through the creation of new virtual patterns. In the case of emotional speech, most of the emotional information is carried by speed and pitch variations, so a change in the average pitch that modifies neither the speed nor the pitch variations does not affect the expressed emotion. We use this prior information to create new patterns by applying a pitch-shift modification in the feature extraction process of the classification system. For this purpose, we propose a frequency-scaling modification of the Mel Frequency Cepstral Coefficients used to classify the emotion. This process allows us to synthetically increase the number of available patterns in the training set, thus increasing the generalization capability of the system and reducing the test error.
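One way to picture the frequency-scaling idea is to scale the Hz positions of the mel filters before extracting features; `alpha` below is a hypothetical warp factor, not a value taken from the paper:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + np.asarray(f, float) / 700.0)

def warped_filter_centres(sr, n_filters, alpha):
    """Centre frequencies (Hz) of a mel filterbank after scaling the
    frequency axis by alpha. Evaluating the filterbank at these scaled
    positions yields features resembling a pitch-shifted version of the
    same frame, i.e. a new 'virtual' training pattern, without resynthesizing audio."""
    mel_pts = np.linspace(0.0, float(hz_to_mel(sr / 2)), n_filters + 2)[1:-1]
    centres = 700.0 * (10.0 ** (mel_pts / 2595.0) - 1.0)
    return centres * alpha

c_orig = warped_filter_centres(16000, 26, alpha=1.0)
c_up = warped_filter_centres(16000, 26, alpha=1.1)  # 10% warped variant
```

Because the warp happens inside feature extraction, one recording can be turned into several training patterns at negligible cost, which is the enlargement mechanism the abstract describes.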
Modeling Text Independent Speaker Identification with Vector QuantizationTELKOMNIKA JOURNAL
Speaker identification is one of the most important technologies nowadays. Many fields such as
bioinformatics and security are using speaker identification. Also, almost all electronic devices are using
this technology too. Based on number of text, speaker identification divided into text dependent and text
independent. On many fields, text independent is mostly used because number of text is unlimited. So, text
independent is generally more challenging than text dependent. In this research, speaker identification text
independent with Indonesian speaker data was modelled with Vector Quantization (VQ). In this research
VQ with K-Means initialization was used. K-Means clustering also was used to initialize mean and
Hierarchical Agglomerative Clustering was used to identify K value for VQ. The best VQ accuracy was
59.67% when k was 5. According to the result, Indonesian language could be modelled by VQ. This
research can be developed using optimization method for VQ parameters such as Genetic Algorithm or
Particle Swarm Optimization.
Virtual private networks (VPN) provide remotely secure connection for clients to exchange information with company networks. This paper deals with Site-to-site IPsec-VPN that connects the company intranets. IPsec-VPN network is implemented with security protocols for key management and exchange, authentication and integrity using GNS3 Network simulator. The testing and verification analyzing of data packets is done using both PING tool and Wireshark to ensure the encryption of data packets during data exchange between different sites belong to the same company.
The performance of an algorithm can be improved using a parallel computing programming approach. In this study, the performance of bubble sort algorithm on various computer specifications has been applied. Experimental results have shown that parallel computing programming can save significant time performance by 61%-65% compared to serial computing programming.
Parameters Optimization for Improving ASR Performance in Adverse Real World N...Waqas Tariq
From the existing research it has been observed that many techniques and methodologies are available for performing every step of Automatic Speech Recognition (ASR) system, but the performance (Minimization of Word Error Recognition-WER and Maximization of Word Accuracy Rate- WAR) of the methodology is not dependent on the only technique applied in that method. The research work indicates that, performance mainly depends on the category of the noise, the level of the noise and the variable size of the window, frame, frame overlap etc is considered in the existing methods. The main aim of the work presented in this paper is to use variable size of parameters like window size, frame size and frame overlap percentage to observe the performance of algorithms for various categories of noise with different levels and also train the system for all size of parameters and category of real world noisy environment to improve the performance of the speech recognition system. This paper presents the results of Signal-to-Noise Ratio (SNR) and Accuracy test by applying variable size of parameters. It is observed that, it is really very hard to evaluate test results and decide parameter size for ASR performance improvement for its resultant optimization. Hence, this study further suggests the feasible and optimum parameter size using Fuzzy Inference System (FIS) for enhancing resultant accuracy in adverse real world noisy environmental conditions. This work will be helpful to give discriminative training of ubiquitous ASR system for better Human Computer Interaction (HCI). Keywords: ASR Performance, ASR Parameters Optimization, Multi-Environmental Training, Fuzzy Inference System for ASR, ubiquitous ASR system, Human Computer Interaction (HCI)
Isolated Word Recognition System For Tamil Spoken Language Using Back Propaga...CSEIJJournal
Speech recognition has been an active research topic for more than 50 years. Interacting with the
computer through speech is one of the active scientific research fields particularly for the disable
community who face variety of difficulties to use the computer. Such research in Automatic Speech
Recognition (ASR) is investigated for different languages because each language has its specific features.
Especially the need for ASR system in Tamil language has been increased widely in the last few years. In
this paper, a speech recognition system for individually spoken word in Tamil language using multilayer
feed forward network is presented. To implement the above system, initially the input signal is
preprocessed using four types of filters namely preemphasis, median, average and Butterworth bandstop
filter in order to remove the background noise and to enhance the signal. The performance of these filters
are measured based on MSE and PSNR values. The best filtered signal is taken as the input for the further
process of ASR system
Comparison of signal smoothing techniques for use in embedded system for moni...Dalton Valadares
Paper about the comparison between some signal smoothing techniques for use in an embedded system responsible for monitoring the biofuels quality, specificaly the oxidative stability.
Machine Translation is an emerging field of Computer Science. Researchers have been done to make Machine Translation systems for different language pairs using different practices including rule based machine translation and Statistical Machine Translation (SMT). The goal of the project is to design a Statistical Machine translator for software language localization using Moses decoder. The system is expected to automatically localize (translate) software contents from English into Tamil by using Statistical Machine Translation.
Comparative Study of Different Techniques in Speaker Recognition: ReviewIJAEMSJORNAL
The speech is most basic and essential method of communication used by person.On the basis of individual information included in speech signals the speaker is recognized. Speaker recognition (SR) is useful to identify the person who is speaking. In recent years speaker recognition is used for security system. In this paper we have discussed the feature extraction techniques like Mel frequency cepstral coefficient (MFCC), Linear predictive coding (LPC), Dynamic time wrapping (DTW), and for classification Gaussian Mixture Models (GMM), Artificial neural network (ANN)& Support vector machine (SVM).
Speaker Recognition System using MFCC and Vector Quantization Approachijsrd.com
This paper presents an approach to speaker recognition using frequency spectral information with Mel frequency for the improvement of speech feature representation in a Vector Quantization codebook based recognition approach. The Mel frequency approach extracts the features of the speech signal to get the training and testing vectors. The VQ Codebook approach uses training vectors to form clusters and recognize accurately with the help of LBG algorithm.
Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition...TELKOMNIKA JOURNAL
Speech recognition can be defined as the process of converting voice signals into the ranks of the
word, by applying a specific algorithm that is implemented in a computer program. The research of speech
recognition in Indonesia is relatively limited. This paper has studied methods of feature extraction which is
the best among the Linear Predictive Coding (LPC) and Mel Frequency Cepstral Coefficients (MFCC) for
speech recognition in Indonesian language. This is important because the method can produce a high
accuracy for a particular language does not necessarily produce the same accuracy for other languages,
considering every language has different characteristics. Thus this research hopefully can help further
accelerate the use of automatic speech recognition for Indonesian language. There are two main
processes in speech recognition, feature extraction and recognition. The method used for comparison
feature extraction in this study is the LPC and MFCC, while the method of recognition using Hidden
Markov Model (HMM). The test results showed that the MFCC method is better than LPC in Indonesian
language speech recognition.
SYNTHETICAL ENLARGEMENT OF MFCC BASED TRAINING SETS FOR EMOTION RECOGNITIONcscpconf
Emotional state recognition through speech is being a very interesting research topic nowadays.
Using subliminal information of speech, it is possible to recognize the emotional state of the
person. One of the main problems in the design of automatic emotion recognition systems is the
small number of available patterns. This fact makes the learning process more difficult, due to
the generalization problems that arise under these conditions.
In this work we propose a solution to this problem consisting in enlarging the training set
through the creation the new virtual patterns. In the case of emotional speech, most of the
emotional information is included in speed and pitch variations. So, a change in the average
pitch that does not modify neither the speed nor the pitch variations does not affect the
expressed emotion. Thus, we use this prior information in order to create new patterns applying
a pitch shift modification in the feature extraction process of the classification system. For this
purpose, we propose a frequency scaling modification of the Mel Frequency Cepstral
Coefficients, used to classify the emotion. This proposed process allows us to synthetically
increase the number of available patterns in thetraining set, thus increasing the generalization
capability of the system and reducing the test error.
Using recycled concrete aggregates (RCA) for pavements is crucial to achieving sustainability. Implementing RCA in new pavement can minimize the carbon footprint, conserve natural resources, reduce harmful emissions, and lower life-cycle costs. Compared to natural aggregate (NA) pavement, however, RCA pavement has been the subject of fewer comprehensive studies and sustainability assessments.
Literature Review Basics and Understanding Reference Management.pptxDr Ramhari Poudyal
Three-day training on academic research focuses on analytical tools at United Technical College, supported by the University Grant Commission, Nepal. 24-26 May 2024
Hierarchical Digital Twin of a Naval Power SystemKerry Sado
A hierarchical digital twin of a Naval DC power system has been developed and experimentally verified. Similar to other state-of-the-art digital twins, this technology creates a digital replica of the physical system executed in real-time or faster, which can modify hardware controls. However, its advantage stems from distributing computational efforts by utilizing a hierarchical structure composed of lower-level digital twin blocks and a higher-level system digital twin. Each digital twin block is associated with a physical subsystem of the hardware and communicates with a singular system digital twin, which creates a system-level response. By extracting information from each level of the hierarchy, power system controls of the hardware were reconfigured autonomously. This hierarchical digital twin development offers several advantages over other digital twins, particularly in the field of naval power systems. The hierarchical structure allows for greater computational efficiency and scalability while the ability to autonomously reconfigure hardware controls offers increased flexibility and responsiveness. The hierarchical decomposition and models utilized were well aligned with the physical twin, as indicated by the maximum deviations between the developed digital twin hierarchy and the hardware.
Understanding Inductive Bias in Machine LearningSUTEJAS
This presentation explores the concept of inductive bias in machine learning. It explains how algorithms come with built-in assumptions and preferences that guide the learning process. You'll learn about the different types of inductive bias and how they can impact the performance and generalizability of machine learning models.
The presentation also covers the positive and negative aspects of inductive bias, along with strategies for mitigating potential drawbacks. We'll explore examples of how bias manifests in algorithms like neural networks and decision trees.
By understanding inductive bias, you can gain valuable insights into how machine learning models work and make informed decisions when building and deploying them.
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdffxintegritypublishin
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. 
Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Dr.Costas Sachpazis
Terzaghi's soil bearing capacity theory, developed by Karl Terzaghi, is a fundamental principle in geotechnical engineering used to determine the bearing capacity of shallow foundations. This theory provides a method to calculate the ultimate bearing capacity of soil, which is the maximum load per unit area that the soil can support without undergoing shear failure. The Calculation HTML Code included.
Forklift Classes Overview by Intella PartsIntella Parts
Discover the different forklift classes and their specific applications. Learn how to choose the right forklift for your needs to ensure safety, efficiency, and compliance in your operations.
For more technical information, visit our website https://intellaparts.com
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...ssuser7dcef0
Power plants release a large amount of water vapor into the
atmosphere through the stack. The flue gas can be a potential
source for obtaining much needed cooling water for a power
plant. If a power plant could recover and reuse a portion of this
moisture, it could reduce its total cooling water intake
requirement. One of the most practical way to recover water
from flue gas is to use a condensing heat exchanger. The power
plant could also recover latent heat due to condensation as well
as sensible heat due to lowering the flue gas exit temperature.
Additionally, harmful acids released from the stack can be
reduced in a condensing heat exchanger by acid condensation. reduced in a condensing heat exchanger by acid condensation.
Condensation of vapors in flue gas is a complicated
phenomenon since heat and mass transfer of water vapor and
various acids simultaneously occur in the presence of noncondensable
gases such as nitrogen and oxygen. Design of a
condenser depends on the knowledge and understanding of the
heat and mass transfer processes. A computer program for
numerical simulations of water (H2O) and sulfuric acid (H2SO4)
condensation in a flue gas condensing heat exchanger was
developed using MATLAB. Governing equations based on
mass and energy balances for the system were derived to
predict variables such as flue gas exit temperature, cooling
water outlet temperature, mole fraction and condensation rates
of water and sulfuric acid vapors. The equations were solved
using an iterative solution technique with calculations of heat
and mass transfer coefficients and physical properties.
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...
International Journal of Latest Research in Engineering and Technology (IJLRET)
ISSN: 2454-5031
www.ijlret.com || Volume 06 - Issue 04 || April 2020 || PP. 01-06
Analysis of speech signal MLBP features
Prof. Ziad Alqadi, Dr. Mohammad S. Khrisat, Dr. Amjad Hindi,
Dr. Majed Omar Dwairi
Albalqa Applied University, Faculty of Engineering Technology, Amman, Jordan
Abstract: The digital audio signal is one of the most important and most widely used types of data in
communication. It is used in many vital applications, the most important of which are digital protection
systems. Since an audio file is large, using it directly in the matching and verification process requires a
large amount of time, which lowers the effectiveness of the security and protection system. We therefore
have to find a suitable way to represent the voice by a small number of values that can be used as sound
features. In this paper we discuss in detail how to use the MLBP method of feature extraction, and we
show that this method is stable, flexible and efficient.
Keywords: Speech, features vector, LBP, MLBP, K_mean clustering, LPC, WPT, FIR, throughput, extraction
time, FS.
Introduction
Digital signals [1], [2], such as digital audio signals (speech) [3], [4] and digital images (gray and color)
[5], [6], [7], are very important types of data because they are used in many vital applications, such as banking
systems, security systems and computer classification systems [4], [5]. In this paper we analyze in detail the
modified local binary pattern method of feature extraction, which can easily be used in a human speech
classification system (HSCS) [6], [7].
The digital speech signal is a one-column matrix (mono signal) or a two-column matrix (stereo signal);
each column holds the sample amplitudes [1], [8], [9]. These sample values are obtained by converting the
analogue speech to digital, as shown in figure 1 [10], [11], [12], by sampling (stage 1) and quantization
(stage 2).
Figure 1: Converting speech analogue signal to digital
The speech signal is an important digital data type due to the vital applications requiring this kind of
data. These applications, such as security systems [3], [4], require a high speed of implementation, but speech
signals usually have a big size, which negatively affects system efficiency. Here we seek a method to represent
the speech by a small number of values in order to speed up speech manipulation. The speech file size depends
on the recording time and the sampling rate [7], [8].
The sampling frequency or sampling rate, fs, is the average number of samples obtained in one second (samples
per second); thus fs = 1/T, where T is the sampling period. Table 1 shows some information about the speech
signals investigated in this paper [7], [8], [9].
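The sample-count and byte-size columns of table 1 follow directly from fs and the recording time. A minimal Python sketch (the 8-bytes-per-sample factor is an assumption inferred from the byte sizes in table 1, i.e. double-precision samples; the helper name is mine):

```python
def speech_size(fs_hz, seconds, bytes_per_sample=8):
    """Estimate (number of samples, size in bytes) of a mono recording:
    samples = fs * recording time, bytes = samples * bytes per sample."""
    samples = round(fs_hz * seconds)
    return samples, samples * bytes_per_sample

# Speech #1 of table 1: 44100 Hz for 5.7832 s gives roughly 255037 samples
# (the small difference comes from the rounded recording time).
print(speech_size(44100, 5.7832))
```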
Table 1: Used speech signal files

Speech # | Spoken words                                            | Fs (Hz) | Recording time (s) | Size (samples) | Size (bytes)
1        | Aqaba is a beautiful city, it is located on the red sea | 44100   | 5.7832             | 255037         | 2040296
2        | Stay home stay safe                                     | 44100   | 2.8451             | 125469         | 1003752
3        | Albalqa applied university                              | 44100   | 3.5109             | 154829         | 1238632
4        | Amman is the capital city of Jordan                     | 44100   | 4.1620             | 183544         | 1468352
5        | How are you                                             | 44100   | 1.9204             | 84691          | 677528
6        | My name is Ziad                                         | 44100   | 2.5021             | 110344         | 882752
7        | Please open the door                                    | 44100   | 2.5362             | 111848         | 894784
8        | Please shut down the computer                           | 44100   | 3.3558             | 147990         | 1183920
9        | Speech signal analysis                                  | 44100   | 2.9507             | 130127         | 1041016
10       | Good by                                                 | 44100   | 1.6909             | 74569          | 596552
Average  |                                                         |         | 3.1257             | 137840         | 1102800
From table 1 we can see that the average number of samples is large, so the average file size is also
large, and this leads to extra time to identify the speech [26], [27]. One option is to represent the speech file by
a histogram [12], [13], [14], [28] of 256 values, with a size of 2048 bytes for each speech file [5], [6], [9].
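As an illustration of this histogram representation (a sketch of my own, not the authors' code): counting the amplitudes, assumed to lie in [-1, 1], into 256 bins gives a fixed 256-value vector which, stored as doubles, occupies 256 * 8 = 2048 bytes regardless of the file length.

```python
def speech_histogram(signal, bins=256):
    """Map sample amplitudes in [-1, 1] to a fixed-length histogram."""
    hist = [0.0] * bins
    for s in signal:
        clamped = min(max(s, -1.0), 1.0)          # guard against out-of-range samples
        idx = int((clamped + 1.0) / 2.0 * (bins - 1))  # scale to a bin index 0..255
        hist[idx] += 1.0
    return hist

h = speech_histogram([-1.0, 0.0, 1.0, 0.5])  # stand-in for a speech file
print(len(h))  # always 256 values, whatever the signal length
```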
To reduce the classification time [14], [15], [16] we have to use signal features instead of the signal
itself [13]; the features vector of each speech signal must be unique, simple and fixed, and must contain a small
number of values [17], [18], [19], [20].
Existing feature extraction methods
Many methods have been introduced to extract features from the digital speech signal. Some of them
are based on calculating local binary pattern (LBP) operators [21], [22], [23], [24]; the method analysed in this
paper is the modified LBP (MLBP) method, which is introduced later in this section.
Some methods use data clustering concepts, such as the K-means clustering (KMC) method
[29], [30], [31], [32], in which the cluster centroids are used as features. This method is not stable in generating
the features: they can change from run to run, and it requires a large amount of time.
Other methods are based on wavelet packet tree decomposition (WPD) [33], [34], [35]. This method is
efficient, but the number of decomposition levels required varies from one speech to another, especially when
the speech file size is not fixed.
Other methods use finite impulse response (FIR) filter coefficients as features [25], [26], based on
linear prediction coding. This method creates stable features and is efficient, requiring only a small feature
extraction time.
The speech files shown in table 1 were processed with these methods, and table 2 summarizes the
results.
Table 2: Summary results of the existing feature extraction methods

Method | Average extraction time (s) | Throughput (samples/s)
WPT    | 0.1466                      | 931062
LPC    | 0.1052                      | 1409429
KMC    | 10.92503                    | 12494
The proposed MLBP method of speech feature extraction can be implemented by applying the following steps:
- Get the speech file.
- Reshape the speech file from a one-column (mono speech) or two-column (stereo speech) matrix into a
one-row matrix.
- Initialize the features vector to zeros (a 4-element vector).
- For each sample in the one-row matrix, apply the steps shown in table 3.
Table 3: MLBP calculation

Samples | ... | S(i-2) | S(i-1) | S(i) | S(i+1) | S(i+2) | ...
Values  | ... | -0.5   | 1      | 0    | -0.8   | 1      | ...

The centre sample S(i) = 0 is compared with its two neighbours: S(i) <= S(i-1) gives binary 1, and
S(i) <= S(i+1) gives binary 0. The binary code 10 equals decimal 2, so 1 is added to the features vector
element with index 2.
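The steps and the worked example above can be sketched in Python as follows. This is my reading of table 3: the ordering of the two comparison bits and the skipping of the two boundary samples are assumptions, not the authors' code.

```python
def mlbp_features(signal):
    """4-element MLBP features vector: histogram of 2-bit codes obtained by
    comparing each interior sample with its left and right neighbours."""
    features = [0, 0, 0, 0]
    for i in range(1, len(signal) - 1):
        b1 = 1 if signal[i] <= signal[i - 1] else 0  # compare with left neighbour
        b0 = 1 if signal[i] <= signal[i + 1] else 0  # compare with right neighbour
        features[2 * b1 + b0] += 1                   # binary "b1 b0" -> index 0..3
    return features

# The worked example of table 3: centre sample 0 with neighbours 1 and -0.8
# contributes to bin 2 (binary 10).
print(mlbp_features([-0.5, 1, 0, -0.8, 1]))  # [1, 0, 1, 1]
```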
Implementation and results discussion
Experiment 1: Varying the sampling frequency for the same speech signal
Here we recorded the spoken words "ziad alqadi" using various sampling rates (frequencies); table 4
shows the results of implementation:
Table 4: Experiment 1 results

FS      | Features                | Samples | ET (seconds)
8000    | 5312  539   435   12041 | 18331   | 0.0010
11025   | 7318  689   606   16645 | 25262   | 0.0020
12000   | 8079  646   521   18246 | 27496   | 0.0021
16000   | 10743 775   667   24473 | 36662   | 0.0023
22050   | 14819 1039  871   33792 | 50525   | 0.0026
24000   | 16387 819   678   37105 | 54993   | 0.0030
32000   | 20990 1830  1742  48758 | 73324   | 0.0034
44100   | 22272 9088  460   69215 | 101039  | 0.0040
48000   | 22732 11407 0     75832 | 109975  | 0.0060
Average |                         | 55290   | 0.0029
From table 4 we can see the following:
- Changing FS changes the speech features.
- The method is efficient, providing an average extraction time of 0.0029 seconds with a throughput of
19066000 samples per second.
- The number of samples increases when FS increases.
- The extraction time increases when FS increases (see figure 2).
Figure 2: Relationship between FS, number of samples and extraction time
Experiment 2: Different spoken words with the same sampling frequency
In this experiment the same person spoke different words with FS fixed at 44100, and the features of each
speech file were extracted; table 5 shows the results of this experiment.
Table 5: Experiment 2 results

Spoken word | Features              | Samples | ET (seconds)
One         | 6216 2491 88   50419  | 59218   | 0.0020
Two         | 9642 3985 200  55410  | 69241   | 0.0030
Three       | 6619 2814 181  49930  | 59548   | 0.0023
Four        | 4217 1809 137  47030  | 53197   | 0.0017
Five        | 8549 3578 243  51395  | 63769   | 0.0027
Six         | 5944 2917 446  50284  | 59595   | 0.0024
Seven       | 8904 3746 256  51928  | 64838   | 0.0029
Eight       | 5435 2240 121  48779  | 56579   | 0.0024
Nine        | 8677 3412 101  50653  | 62847   | 0.0025
Ten         | 8361 3494 181  50876  | 62916   | 0.0026
Average     |                       | 61175   | 0.0024
From table 5 we can see the following:
- The method provides a high-speed extraction process.
- The features of each spoken word are unique, so we can identify both the person and the spoken word.
- The results show that this method can be used in password-based security applications.
Experiment 3: The same words spoken by different persons
Here five persons spoke the sentence "My name is ziad alqadi"; table 6 shows the results of this experiment.
Table 6: Experiment 3 results

Person | Features                | FS    | Samples | ET (seconds)
1      | 6982  1631 1578  7277   | 11025 | 17472   | 0.0010
2      | 14068 84   100   14217  | 44100 | 28473   | 0.0017
3      | 23518 5115 3367  32313  | 11025 | 64317   | 0.0028
4      | 18220 5351 5241  18072  | 11025 | 46888   | 0.0021
5      | 30916 6815 7011  30980  | 11025 | 75726   | 0.0032
From table 6 we can see the following:
- The extracted features of each speech were unique.
- The extraction process was repeated several times and the features remained the same.
- The method can be considered stable and efficient.
The speeches shown in table 1 were processed using the MLBP method, and the average extraction time
was 0.0069 seconds; table 7 shows the summary results compared with the other methods:
Table 7: Speed-up of the MLBP method

Method | Average extraction time (s) | Throughput (samples/s) | Speed-up of MLBP
WPT    | 0.1466                      | 931062                 | 21.2
LPC    | 0.1052                      | 1409429                | 15.2
KMC    | 10.92503                    | 12494                  | 1583.3
MLBP   | 0.0069                      | 19782000               | 1
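The speed-up column of table 7 is simply the ratio of each method's average extraction time to that of MLBP, which can be checked directly:

```python
# Speed-up of MLBP over a method = (method's extraction time) / (MLBP's extraction time),
# using the average extraction times reported in table 7.
mlbp_time = 0.0069
times = {"WPT": 0.1466, "LPC": 0.1052, "KMC": 10.92503}
speedups = {method: round(t / mlbp_time, 1) for method, t in times.items()}
print(speedups)  # reproduces the speed-up column of table 7
```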
From table 7 we can see that the MLBP method is the most efficient, providing the minimum
extraction time and the maximum throughput.
Conclusion
The MLBP method of speech feature extraction was introduced, implemented and tested. The experimental
results showed that the introduced method is very efficient compared with other methods, and that the
extracted features of each speech file are stable and unique; the features can easily be used to identify the
person and to identify the words spoken by a certain person.
References
[1]. Ziad Alqadi, Bilal Zahran, Jihad Nader, Estimation and Tuning of FIR Lowpass Digital Filter
Parameters, International Journal of Advanced Research in Computer Science and Software
Engineering, vol. 7, issue 2, pp. 18-23, 2017.
[2]. Haitham Alasha'ary, Abdullah Al-Hasanat, Khaled Matrouk, Ziad Al-Qadi, Hasan Al-Shalabi, A Novel
Digital Filter for Enhancing Dark Gray Images, European Journal of Scientific Research , pp. 99-106,
2014.
[3]. Majed O Al-Dwairi, Ziad A Alqadi, Amjad A Abujazar, Rushdi Abu Zneit, Optimized true-color image
processing, World Applied Sciences Journal, vol. 8, issue 10, pp. 1175-1182, 2010.
[4]. Jamil Al Azzeh, Hussein Alhatamleh, Ziad A Alqadi, Mohammad Khalil Abuzalata, Creating a Color
Map to be used to Convert a Gray Image to Color Image, International Journal of Computer
Applications, vol. 153, issue 2, pp. 31-34, 2016.
[5]. Musbah J. Aqel, Ziad A. Alqadi, Ibraheim M. El Emary, Analysis of Stream Cipher Security
Algorithm, Journal of Information and Computing Science, vol. 2, issue 4, pp. 288-298, 2007.
[6]. Jamil Al-Azzeh, Ziad Alqadi, Mohammed Abuzalata, Performance Analysis of Artificial Neural
Networks used for Color Image Recognition and Retrieving, International Journal of Computer Science
and Mobile Computing, vol. 8, issue 2, pp. 20 – 33, 2019.
[7]. Akram A. Moustafa and Ziad A. Alqadi, Color Image Reconstruction Using A New R'G'I Model,
Journal of Computer Science, vol. 5, issue 4, pp. 250-254, 2009.
[8]. Dr. Ziad Alqadi, Akram Mustafa, Majed Alduari, Rushdi Abu Zneit, True color image enhancement
using morphological operations, International review on computer and software, vol. 4, issue 5, pp.
557-562, 2009.
[9]. Mohammed Ashraf Al Zudool, Saleh Khawatreh, Ziad A. Alqadi, Efficient Methods used to Extract
Color Image Features, IJCSMC, vol. 6, issue 12, pp. 7-14, 2017.
[10]. Ayman Al-Rawashdeh, Ziad Al-Qadi, Using wave equation to extract digital signal features,
Engineering, Technology & Applied Science Research, vol. 8, issue 4, pp. 1356-1359, 2018.
[11]. AlQaisi Aws, AlTarawneh Mokhled, A Alqadi Ziad, A Sharadqah Ahmad, Analysis of Color Image
Features Extraction using Texture Methods, TELKOMNIKA, vol. 17, issue 3, 2018.
[12]. Dr Ziad A AlQadi, Hussein M Elsayyed, Window Averaging Method to Create a Feature Victor for
RGB Color Image, International Journal of Computer Science and Mobile Computing, vol. 6, issue 2,
pp. 60-66, 2017.
[13]. Dr. Ghazi. M. Qaryouti, Prof. Ziad A.A. Alqadi, Prof. Mohammed K. Abu Zalata, A Novel Method for
Color Image Recognition, IJCSMC, Vol. 5, Issue. 11, pp.57 – 64, 2016.
[14]. Jihad Nader Ismail Shayeb, Ziad Alqadi, Analysis of digital voice features extraction methods,
International Journal of Educational Research and Development, vol 4, issue 1, pp. 49-55, 2019.
[15]. Ahmad Sharadqh Naseem Asad, Ismail Shayeb, Qazem Jaber, Belal Ayyoub, Ziad Alqadi, Creating a
Stable and Fixed Features Array for Digital Color Image, IJCSMC, vol. 8, issue 8, pp. 50-56, 2019.
[16]. Majed O. Al-Dwairi, Amjad Y. Hendi, Mohamed S. Soliman, Ziad A.A. Alqadi, A new method for
voice signal features creation, International Journal of Electrical and Computer Engineering (IJECE),
vol. 9, issue 5, pp. 4092-4098, 2019.
[17]. Dr. Amjad Hindi Dr. Majed Omar Dwairi Prof. Ziad Alqadi, PROCEDURES FOR SPEECH
RECOGNITION USING LPC AND ANN, International Journal of Engineering Technology Research
& Management, vol. 4, issue 2, pp. 48-55, 2020.
[18]. Yousf Eltous Ziad A. AlQadi, Ghazi M. Qaryouti, Mohammad Abuzalata, ANALYSIS OF DIGITAL
SIGNAL FEATURES EXTRACTION BASED ON KMEANS CLUSTERING, International Journal
of Engineering Technology Research & Management, vol. 4, issue 1, pp. 66-75, 2020.
[19]. Prof. Yousif Eltous, Dr. Ghazi M. Qaryouti, Prof. Mohammad Abuzalata, Prof. Ziad Alqadi,
Evaluation of Fuzzy and C_mean Clustering Methods used to Generate Voiceprint, IJCSMC, vol. 9,
issue 1, pp. 75 -83, 2020.
[20]. Ahmad Sharadqh Naseem Asad, Ismail Shayeb, Qazem Jaber, Belal Ayyoub, Ziad Alqadi, Creating a
Stable and Fixed Features Array for Digital Color Image, IJCSMC, vol. 8, issue 8, pp. 50-56, 2019.
[21]. Ismail Shayeb, Ziad Alqadi, Jihad Nader, Analysis of digital voice features extraction methods,
International Journal of Educational Research and Development, vol. 1, issue 4, pp. 49-55, 2019.
[22]. Aws Al-Qaisi, Saleh A Khawatreh, Ahmad A Sharadqah, Ziad A Alqadi, Wave File Features
Extraction Using Reduced LBP, International Journal of Electrical and Computer Engineering, vol. 8,
issue 5, pp. 2780, 2018.
[23]. Ziad Alqad, Prof. Yousf Eltous Dr. Ghazi M. Qaryouti, Prof. Mohammad Abuzalata, Analysis of
Digital Signal Features Extraction Based on LBP Operator, International Journal of Advanced
Research in Computer and Communication Engineering, vol. 9, issue 1, pp. 1-7, 2020.
[24]. Ziad Alqadi, Aws Al-Qaisi, Adnan Manasreh, Ahmad Sharadqeh, Digital Color Image Classification
Based on Modified Local Binary Pattern Using Neural Network, IRECAP, vol. 9, issue 6, pp. 403-408,
2019.
[25]. Dr. Amjad Hindi Prof. Yousif Eltous Prof. Mohammad Abuzalata Prof. Ziad Alqadi Dr. Ghazi M.
Qaryouti, USING FIR FILTER COEFFICIENTS TO CREATE COLOR IMAGE FEATURES,
International Journal of Engineering Technology Research & Management, vol. 4, issue 2, pp. 6-14,
2020.
[26]. Ziad Alqadi, Bilal Zahran, Jihad Nader, Estimation and Tuning of FIR Lowpass Digital Filter
Parameters, International Journal of Advanced Research in Computer Science and Software
Engineering, vol. 7, issue 2, pp. 18-23, 2017.
[27]. Prof. Yousif Eltous Dr. Amjad Hindi, Prof. Ziad Alqadi, Dr. Ghazi M. Qaryouti, Prof. Mohammad
Abuzalata, Using FIR Coefficients to Form a Voiceprint, International Journal of Innovative Research
in Electrical, Electronics, Instrumentation and Control Engineering, vol. 8, issue 1, pp. 1-6, 2020.
[28]. Dr. Amjad Hindi Dr. Majed Omar Dwairi Prof. Ziad Alqadi, PROCEDURES FOR SPEECH
RECOGNITION USING LPC AND ANN, International Journal of Engineering Technology Research
& Management, vol. 4, issue 2, pp. 48-55, 2020.
[29]. Yousf Eltous Ziad A. AlQadi, Ghazi M. Qaryouti, Mohammad Abuzalata, ANALYSIS OF DIGITAL
SIGNAL FEATURES EXTRACTION BASED ON KMEANS CLUSTERING, International Journal
of Engineering Technology Research & Management, vol. 4, issue 1, pp. 66-75, 2020.
[30]. Prof. Yousif Eltous, Dr. Ghazi M. Qaryouti, Prof. Mohammad Abuzalata, Prof. Ziad Alqadi,
Evaluation of Fuzzy and C_mean Clustering Methods used to Generate Voiceprint, IJCSMC, vol. 9,
issue 1, pp. 75 -83, 2020.
[31]. Ahmad Sharadqh Naseem Asad, Ismail Shayeb, Qazem Jaber, Belal Ayyoub, Ziad Alqadi, Creating a
Stable and Fixed Features Array for Digital Color Image, IJCSMC, vol. 8, issue 8, pp. 50-56, 2019.
[32]. Ahmad Sharadqh Jamil Al-Azzeh, Rashad Rasras , Ziad Alqadi , Belal Ayyoub, Adaptation of matlab
K-means clustering function to create Color Image Features, International Journal of Research in
Advanced Engineering and Technology, vol. 5, issue 2, pp. 10-18, 2019.
[33]. Dr. Ghazi M. Qaryouti, Prof. Mohammad Abuzalata, Prof. Yousf Eltous, Prof. Ziad Alqadi,
Comparative Study of Voice Signal Features Extraction Methods, IOSR Journal of Computer
Engineering (IOSR-JCE), vol. 22, issue 1, pp. 58-66, 2020.
[34]. Amjad Y. Hindi, Majed O. Dwairi, Ziad A. AlQadi, Analysis of Digital Signals using Wavelet Packet
Tree, IJCSMC, vol. 9, issue 2, pp. 96-103, 2020.
[35]. Amjad Y. Hindi, Majed O. Dwairi, Ziad A. AlQadi, Creating Human Speech Identifier using WPT,
International Journal of Computer Science and Mobile Computing, vol. 9, issue 2, pp. 117 – 123, 2020.