IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
BIDIRECTIONAL LONG SHORT-TERM MEMORY (BILSTM)WITH CONDITIONAL RANDOM FIELDS (...ijnlc
This study investigates the effectiveness of Knowledge Named Entity Recognition in Online Judges (OJs). OJs are lacking in the classification of topics and limited to the IDs only. Therefore a lot of time is consumed in finding programming problems more specifically in knowledge entities.A Bidirectional Long Short-Term Memory (BiLSTM) with Conditional Random Fields (CRF) model is applied for the recognition of knowledge named entities existing in the solution reports.For the test run, more than 2000 solution reports are crawled from the Online Judges and processed for the model output. The stability of the model is
also assessed with the higher F1 value. The results obtained through the proposed BiLSTM-CRF model are more effectual (F1: 98.96%) and efficient in lead-time.
Sentiment Analysis In Myanmar Language Using Convolutional Lstm Neural Networkkevig
In recent years, there has been an increasing use of social media among people in Myanmar and writing
review on social media pages about the product, movie, and trip are also popular among people. Moreover,
most of the people are going to find the review pages about the product they want to buy before deciding
whether they should buy it or not. Extracting and receiving useful reviews over interesting products is very
important and time consuming for people. Sentiment analysis is one of the important processes for extracting
useful reviews of the products. In this paper, the Convolutional LSTM neural network architecture is
proposed to analyse the sentiment classification of cosmetic reviews written in Myanmar Language. The
paper also intends to build the cosmetic reviews dataset for deep learning and sentiment lexicon in Myanmar
Language.
Chunking means splitting the sentences into tokens and then grouping them in a meaningful way. When it comes to high-performance chunking systems, transformer models have proved to be the state of the art benchmarks. To perform chunking as a task it requires a large-scale high quality annotated corpus where each token is attached with a particular tag similar as that of Named Entity Recognition Tasks. Later these tags are used in conjunction with pointer frameworks to find the final chunk. To solve this for a specific domain problem, it becomes a highly costly affair in terms of time and resources to manually annotate and produce a large-high-quality training set. When the domain is specific and diverse, then cold starting becomes even more difficult because of the expected large number of manually annotated queries to cover all aspects. To overcome the problem, we applied a grammar-based text generation mechanism where instead of annotating a sentence we annotate using grammar templates. We defined various templates corresponding to different grammar rules. To create a sentence we used these templates along with the rules where symbol or terminal values were chosen from the domain data catalog. It helped us to create a large number of annotated queries. These annotated queries were used for training the machine learning model using an ensemble transformer-based deep neural network model [24.] We found that grammar-based annotation was useful to solve domain-based chunks in input query sentences without any manual annotation where it was found to achieve a classification F1 score of 96.97% in classifying the tokens for the out of template queries.
Transformer Models have taken over most of the Natural language Inference tasks. In recent
times they have proved to beat several benchmarks. Chunking means splitting the sentences into
tokens and then grouping them in a meaningful way. Chunking is a task that has gradually
moved from POS tag-based statistical models to neural nets using Language models such as
LSTM, Bidirectional LSTMs, attention models, etc. Deep neural net Models are deployed
indirectly for classifying tokens as different tags defined under Named Recognition Tasks. Later
these tags are used in conjunction with pointer frameworks for the final chunking task. In our
paper, we propose an Ensemble Model using a fine-tuned Transformer Model and a recurrent
neural network model together to predict tags and chunk substructures of a sentence. We
analyzed the shortcomings of the transformer models in predicting different tags and then
trained the BILSTM+CNN accordingly to compensate for the same.
This paper presents machine translation based on machine learning, which learns the semantically
correct corpus. The machine learning process based on Quantum Neural Network (QNN) is used to
recognizing the corpus pattern in realistic way. It translates on the basis of knowledge gained during
learning by entering pair of sentences from source to target language. By taking help of this training data
it translates the given text. own text.The paper consist study of a machine translation system which
converts source language to target language using quantum neural network. Rather than comparing
words semantically QNN compares numerical tags which is faster and accurate. In this tagger tags the
part of sentences discretely which helps to map bilingual sentences.
BIDIRECTIONAL LONG SHORT-TERM MEMORY (BILSTM)WITH CONDITIONAL RANDOM FIELDS (...ijnlc
This study investigates the effectiveness of Knowledge Named Entity Recognition in Online Judges (OJs). OJs are lacking in the classification of topics and limited to the IDs only. Therefore a lot of time is consumed in finding programming problems more specifically in knowledge entities.A Bidirectional Long Short-Term Memory (BiLSTM) with Conditional Random Fields (CRF) model is applied for the recognition of knowledge named entities existing in the solution reports.For the test run, more than 2000 solution reports are crawled from the Online Judges and processed for the model output. The stability of the model is
also assessed with the higher F1 value. The results obtained through the proposed BiLSTM-CRF model are more effectual (F1: 98.96%) and efficient in lead-time.
Sentiment Analysis In Myanmar Language Using Convolutional Lstm Neural Networkkevig
In recent years, there has been an increasing use of social media among people in Myanmar and writing
review on social media pages about the product, movie, and trip are also popular among people. Moreover,
most of the people are going to find the review pages about the product they want to buy before deciding
whether they should buy it or not. Extracting and receiving useful reviews over interesting products is very
important and time consuming for people. Sentiment analysis is one of the important processes for extracting
useful reviews of the products. In this paper, the Convolutional LSTM neural network architecture is
proposed to analyse the sentiment classification of cosmetic reviews written in Myanmar Language. The
paper also intends to build the cosmetic reviews dataset for deep learning and sentiment lexicon in Myanmar
Language.
Chunking means splitting the sentences into tokens and then grouping them in a meaningful way. When it comes to high-performance chunking systems, transformer models have proved to be the state of the art benchmarks. To perform chunking as a task it requires a large-scale high quality annotated corpus where each token is attached with a particular tag similar as that of Named Entity Recognition Tasks. Later these tags are used in conjunction with pointer frameworks to find the final chunk. To solve this for a specific domain problem, it becomes a highly costly affair in terms of time and resources to manually annotate and produce a large-high-quality training set. When the domain is specific and diverse, then cold starting becomes even more difficult because of the expected large number of manually annotated queries to cover all aspects. To overcome the problem, we applied a grammar-based text generation mechanism where instead of annotating a sentence we annotate using grammar templates. We defined various templates corresponding to different grammar rules. To create a sentence we used these templates along with the rules where symbol or terminal values were chosen from the domain data catalog. It helped us to create a large number of annotated queries. These annotated queries were used for training the machine learning model using an ensemble transformer-based deep neural network model [24.] We found that grammar-based annotation was useful to solve domain-based chunks in input query sentences without any manual annotation where it was found to achieve a classification F1 score of 96.97% in classifying the tokens for the out of template queries.
Transformer Models have taken over most of the Natural language Inference tasks. In recent
times they have proved to beat several benchmarks. Chunking means splitting the sentences into
tokens and then grouping them in a meaningful way. Chunking is a task that has gradually
moved from POS tag-based statistical models to neural nets using Language models such as
LSTM, Bidirectional LSTMs, attention models, etc. Deep neural net Models are deployed
indirectly for classifying tokens as different tags defined under Named Recognition Tasks. Later
these tags are used in conjunction with pointer frameworks for the final chunking task. In our
paper, we propose an Ensemble Model using a fine-tuned Transformer Model and a recurrent
neural network model together to predict tags and chunk substructures of a sentence. We
analyzed the shortcomings of the transformer models in predicting different tags and then
trained the BILSTM+CNN accordingly to compensate for the same.
This paper presents machine translation based on machine learning, which learns the semantically
correct corpus. The machine learning process based on Quantum Neural Network (QNN) is used to
recognizing the corpus pattern in realistic way. It translates on the basis of knowledge gained during
learning by entering pair of sentences from source to target language. By taking help of this training data
it translates the given text. own text.The paper consist study of a machine translation system which
converts source language to target language using quantum neural network. Rather than comparing
words semantically QNN compares numerical tags which is faster and accurate. In this tagger tags the
part of sentences discretely which helps to map bilingual sentences.
EXTRACTIVE SUMMARIZATION WITH VERY DEEP PRETRAINED LANGUAGE MODELijaia
Recent development of generative pretrained language models has been proven very successful on a wide
range of NLP tasks, such as text classification, question answering, textual entailment and so on. In this
work, we present a two-phase encoder decoder architecture based on Bidirectional Encoding
Representation from Transformers(BERT) for extractive summarization task. We evaluated our model by
both automatic metrics and human annotators, and demonstrated that the architecture achieves the stateof-the-art comparable result on large scale corpus – ‘CNN/Daily Mail1
As the best of our knowledge’, this
is the first work that applies BERT based architecture to a text summarization task and achieved the stateof-the-art comparable result.
Extractive Summarization with Very Deep Pretrained Language Modelgerogepatton
Recent development of generative pretrained language models has been proven very successful on a wide range of NLP tasks, such as text classification, question answering, textual entailment and so on.In this work, we present a two-phase encoder decoder architecture based on Bidirectional Encoding Representation from Transformers(BERT) for extractive summarization task. We evaluated our model by both automatic metrics and human annotators, and demonstrated that the architecture achieves the state-of-the-art comparable result on large scale corpus - CNN/Daily Mail1. As the best of our knowledge, this is the first work that applies BERT based architecture to a text summarization task and achieved the state-of-the-art comparable result.
Parameters Optimization for Improving ASR Performance in Adverse Real World N...Waqas Tariq
From the existing research it has been observed that many techniques and methodologies are available for performing every step of Automatic Speech Recognition (ASR) system, but the performance (Minimization of Word Error Recognition-WER and Maximization of Word Accuracy Rate- WAR) of the methodology is not dependent on the only technique applied in that method. The research work indicates that, performance mainly depends on the category of the noise, the level of the noise and the variable size of the window, frame, frame overlap etc is considered in the existing methods. The main aim of the work presented in this paper is to use variable size of parameters like window size, frame size and frame overlap percentage to observe the performance of algorithms for various categories of noise with different levels and also train the system for all size of parameters and category of real world noisy environment to improve the performance of the speech recognition system. This paper presents the results of Signal-to-Noise Ratio (SNR) and Accuracy test by applying variable size of parameters. It is observed that, it is really very hard to evaluate test results and decide parameter size for ASR performance improvement for its resultant optimization. Hence, this study further suggests the feasible and optimum parameter size using Fuzzy Inference System (FIS) for enhancing resultant accuracy in adverse real world noisy environmental conditions. This work will be helpful to give discriminative training of ubiquitous ASR system for better Human Computer Interaction (HCI). Keywords: ASR Performance, ASR Parameters Optimization, Multi-Environmental Training, Fuzzy Inference System for ASR, ubiquitous ASR system, Human Computer Interaction (HCI)
Deep Learning in practice : Speech recognition and beyond - MeetupLINAGORA
Retrouvez la présentation de notre Meetup du 27 septembre 2017 présenté par notre collaborateur Abdelwahab HEBA : Deep Learning in practice : Speech recognition and beyond
Arabic named entity recognition using deep learning approachIJECEIAES
Most of the Arabic Named Entity Recognition (NER) systems depend massively on external resources and handmade feature engineering to achieve state-of-the-art results. To overcome such limitations, we proposed, in this paper, to use deep learning approach to tackle the Arabic NER task. We introduced a neural network architecture based on bidirectional Long Short-Term Memory (LSTM) and Conditional Random Fields (CRF) and experimented with various commonly used hyperparameters to assess their effect on the overall performance of our system. Our model gets two sources of information about words as input: pre-trained word embeddings and character-based representations and eliminated the need for any task-specific knowledge or feature engineering. We obtained state-of-the-art result on the standard ANERcorp corpus with an F1 score of 90.6%.
Hindi digits recognition system on speech data collected in different natural...csandit
This paper presents a baseline digits speech recognizer for Hindi language. The recording environment is different for all speakers, since the data is collected in their respective homes. The different environment refers to vehicle horn noises in some road facing rooms, internal background noises in some rooms like opening doors, silence in some rooms etc. All these recordings are used for training acoustic model. The Acoustic Model is trained on 8 speakers’ audio data. The vocabulary size of the recognizer is 10 words. HTK toolkit is used for building
acoustic model and evaluating the recognition rate of the recognizer. The efficiency of the recognizer developed on recorded data, is shown at the end of the paper and possible directions for future research work are suggested.
BERT - Part 1 Learning Notes of Senthil KumarSenthil Kumar M
In this part 1 presentation, I have attempted to provide a '30,000 feet view' of BERT (Bidirectional Encoder Representations from Transformer) - a state of the art Language Model in NLP with high level technical explanations. I have attempted to collate useful information about BERT from various useful sources.
This Part 2 presentation is a more in-depth view of BERT - Bidirectional Encoder Representations from Transformer. The source links offer more depth to the brief overview in the slides
An Improved Approach for Word Ambiguity RemovalWaqas Tariq
Word ambiguity removal is a task of removing ambiguity from a word, i.e. correct sense of word is identified from ambiguous sentences. This paper describes a model that uses Part of Speech tagger and three categories for word sense disambiguation (WSD). Human Computer Interaction is very needful to improve interactions between users and computers. For this, the Supervised and Unsupervised methods are combined. The WSD algorithm is used to find the efficient and accurate sense of a word based on domain information. The accuracy of this work is evaluated with the aim of finding best suitable domain of word. Keywords: Human Computer Interaction, Supervised Training, Unsupervised Learning, Word Ambiguity, Word sense disambiguation
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
PUNJABI SPEECH SYNTHESIS SYSTEM USING HTKijistjournal
This paper describes an Hidden Markov Model-based Punjabi text-to-speech synthesis system (HTS), in which speech waveform is generated from Hidden Markov Models themselves, and applies it to Punjabi speech synthesis using the general speech synthesis architecture of HTK (HMM Tool Kit). This Hidden Markov Model based TTS can be used in mobile phones for stored phone directory or messages. Text messages and caller’s identity in English language are mapped to tokens in Punjabi language which are further concatenated to form speech with certain rules and procedures.
To build the synthesizer we recorded the speech database and phonetically segmented it, thus first extracting context-independent monophones and then context-dependent triphones. For e.g. for word bharat monophones are a, bh, t etc. & triphones are bh-a+r. These speech utterances and their phone level transcriptions (monophones and triphones) are the inputs to the speech synthesis system. System outputs the sequence of phonemes after resolving various ambiguities regarding selection of phonemes using word network files e.g. for the word Tapas the output phoneme sequence is ਤ,ਪ,ਸ instead of phoneme sequence ਟ,ਪ,ਸ .
An introduction to the Transformers architecture and BERTSuman Debnath
The transformer is one of the most popular state-of-the-art deep (SOTA) learning architectures that is mostly used for natural language processing (NLP) tasks. Ever since the advent of the transformer, it has replaced RNN and LSTM for various tasks. The transformer also created a major breakthrough in the field of NLP and also paved the way for new revolutionary architectures such as BERT.
Our project is about guessing the correct missing
word in a given sentence. To find of guess the missing word
we have two main methods one of them statistical language
modeling, while the other is neural language models.
Statistical language modeling depend on the frequency of the
relation between words and here we use Markov chain. Since
neural language models uses artificial neural networks which
uses deep learning, here we use BERT which is the state of art
in language modeling provided by google.
Intelligent location tracking scheme for handling user’s mobilityeSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Genetic programming for prediction of local scour at vertical bridge abutmenteSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
An effective attack preventing routing approach in speed network in manetseSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
EXTRACTIVE SUMMARIZATION WITH VERY DEEP PRETRAINED LANGUAGE MODELijaia
Recent development of generative pretrained language models has been proven very successful on a wide
range of NLP tasks, such as text classification, question answering, textual entailment and so on. In this
work, we present a two-phase encoder decoder architecture based on Bidirectional Encoding
Representation from Transformers(BERT) for extractive summarization task. We evaluated our model by
both automatic metrics and human annotators, and demonstrated that the architecture achieves the stateof-the-art comparable result on large scale corpus – ‘CNN/Daily Mail1
As the best of our knowledge’, this
is the first work that applies BERT based architecture to a text summarization task and achieved the stateof-the-art comparable result.
Extractive Summarization with Very Deep Pretrained Language Modelgerogepatton
Recent development of generative pretrained language models has been proven very successful on a wide range of NLP tasks, such as text classification, question answering, textual entailment and so on.In this work, we present a two-phase encoder decoder architecture based on Bidirectional Encoding Representation from Transformers(BERT) for extractive summarization task. We evaluated our model by both automatic metrics and human annotators, and demonstrated that the architecture achieves the state-of-the-art comparable result on large scale corpus - CNN/Daily Mail1. As the best of our knowledge, this is the first work that applies BERT based architecture to a text summarization task and achieved the state-of-the-art comparable result.
Parameters Optimization for Improving ASR Performance in Adverse Real World N...Waqas Tariq
From the existing research it has been observed that many techniques and methodologies are available for performing every step of Automatic Speech Recognition (ASR) system, but the performance (Minimization of Word Error Recognition-WER and Maximization of Word Accuracy Rate- WAR) of the methodology is not dependent on the only technique applied in that method. The research work indicates that, performance mainly depends on the category of the noise, the level of the noise and the variable size of the window, frame, frame overlap etc is considered in the existing methods. The main aim of the work presented in this paper is to use variable size of parameters like window size, frame size and frame overlap percentage to observe the performance of algorithms for various categories of noise with different levels and also train the system for all size of parameters and category of real world noisy environment to improve the performance of the speech recognition system. This paper presents the results of Signal-to-Noise Ratio (SNR) and Accuracy test by applying variable size of parameters. It is observed that, it is really very hard to evaluate test results and decide parameter size for ASR performance improvement for its resultant optimization. Hence, this study further suggests the feasible and optimum parameter size using Fuzzy Inference System (FIS) for enhancing resultant accuracy in adverse real world noisy environmental conditions. This work will be helpful to give discriminative training of ubiquitous ASR system for better Human Computer Interaction (HCI). Keywords: ASR Performance, ASR Parameters Optimization, Multi-Environmental Training, Fuzzy Inference System for ASR, ubiquitous ASR system, Human Computer Interaction (HCI)
Deep Learning in practice : Speech recognition and beyond - MeetupLINAGORA
Retrouvez la présentation de notre Meetup du 27 septembre 2017 présenté par notre collaborateur Abdelwahab HEBA : Deep Learning in practice : Speech recognition and beyond
Arabic named entity recognition using deep learning approachIJECEIAES
Most of the Arabic Named Entity Recognition (NER) systems depend massively on external resources and handmade feature engineering to achieve state-of-the-art results. To overcome such limitations, we proposed, in this paper, to use deep learning approach to tackle the Arabic NER task. We introduced a neural network architecture based on bidirectional Long Short-Term Memory (LSTM) and Conditional Random Fields (CRF) and experimented with various commonly used hyperparameters to assess their effect on the overall performance of our system. Our model gets two sources of information about words as input: pre-trained word embeddings and character-based representations and eliminated the need for any task-specific knowledge or feature engineering. We obtained state-of-the-art result on the standard ANERcorp corpus with an F1 score of 90.6%.
Hindi digits recognition system on speech data collected in different natural...csandit
This paper presents a baseline digits speech recognizer for Hindi language. The recording environment is different for all speakers, since the data is collected in their respective homes. The different environment refers to vehicle horn noises in some road facing rooms, internal background noises in some rooms like opening doors, silence in some rooms etc. All these recordings are used for training acoustic model. The Acoustic Model is trained on 8 speakers’ audio data. The vocabulary size of the recognizer is 10 words. HTK toolkit is used for building
acoustic model and evaluating the recognition rate of the recognizer. The efficiency of the recognizer developed on recorded data, is shown at the end of the paper and possible directions for future research work are suggested.
BERT - Part 1 Learning Notes of Senthil KumarSenthil Kumar M
In this part 1 presentation, I have attempted to provide a '30,000 feet view' of BERT (Bidirectional Encoder Representations from Transformer) - a state of the art Language Model in NLP with high level technical explanations. I have attempted to collate useful information about BERT from various useful sources.
This Part 2 presentation is a more in-depth view of BERT - Bidirectional Encoder Representations from Transformer. The source links offer more depth to the brief overview in the slides
An Improved Approach for Word Ambiguity RemovalWaqas Tariq
Word ambiguity removal is a task of removing ambiguity from a word, i.e. correct sense of word is identified from ambiguous sentences. This paper describes a model that uses Part of Speech tagger and three categories for word sense disambiguation (WSD). Human Computer Interaction is very needful to improve interactions between users and computers. For this, the Supervised and Unsupervised methods are combined. The WSD algorithm is used to find the efficient and accurate sense of a word based on domain information. The accuracy of this work is evaluated with the aim of finding best suitable domain of word. Keywords: Human Computer Interaction, Supervised Training, Unsupervised Learning, Word Ambiguity, Word sense disambiguation
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
PUNJABI SPEECH SYNTHESIS SYSTEM USING HTKijistjournal
This paper describes an Hidden Markov Model-based Punjabi text-to-speech synthesis system (HTS), in which speech waveform is generated from Hidden Markov Models themselves, and applies it to Punjabi speech synthesis using the general speech synthesis architecture of HTK (HMM Tool Kit). This Hidden Markov Model based TTS can be used in mobile phones for stored phone directory or messages. Text messages and caller’s identity in English language are mapped to tokens in Punjabi language which are further concatenated to form speech with certain rules and procedures.
To build the synthesizer we recorded the speech database and phonetically segmented it, thus first extracting context-independent monophones and then context-dependent triphones. For e.g. for word bharat monophones are a, bh, t etc. & triphones are bh-a+r. These speech utterances and their phone level transcriptions (monophones and triphones) are the inputs to the speech synthesis system. System outputs the sequence of phonemes after resolving various ambiguities regarding selection of phonemes using word network files e.g. for the word Tapas the output phoneme sequence is ਤ,ਪ,ਸ instead of phoneme sequence ਟ,ਪ,ਸ .
An introduction to the Transformers architecture and BERTSuman Debnath
The transformer is one of the most popular state-of-the-art deep (SOTA) learning architectures that is mostly used for natural language processing (NLP) tasks. Ever since the advent of the transformer, it has replaced RNN and LSTM for various tasks. The transformer also created a major breakthrough in the field of NLP and also paved the way for new revolutionary architectures such as BERT.
Our project is about guessing the correct missing
word in a given sentence. To find of guess the missing word
we have two main methods one of them statistical language
modeling, while the other is neural language models.
Statistical language modeling depend on the frequency of the
relation between words and here we use Markov chain. Since
neural language models uses artificial neural networks which
uses deep learning, here we use BERT which is the state of art
in language modeling provided by google.
Intelligent location tracking scheme for handling user’s mobilityeSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Genetic programming for prediction of local scour at vertical bridge abutmenteSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
An effective attack preventing routing approach in speed network in manetseSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Comparative study of chemically and mechanically singed knit fabriceSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Wear behaviour of si c reinforced al6061 alloy metal matrix composites by usi...eSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Treatability study of cetp wastewater using physico chemical process-a case s...eSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
Speech processing is considered as crucial and an intensive field of research in the growth of robust and efficient speech recognition system. But the accuracy for speech recognition still focuses for variation of context, speaker’s variability, and environment conditions. In this paper, we stated curvelet based Feature Extraction (CFE) method for speech recognition in noisy environment and the input speech signal is decomposed into different frequency channels using the characteristics of curvelet transform for reduce the computational complication and the feature vector size successfully and they have better accuracy, varying window size because of which they are suitable for non –stationary signals. For better word classification and recognition, discrete hidden markov model can be used and as they consider time distribution of speech signals. The HMM classification method attained the maximum accuracy in term of identification rate for informal with 80.1%, scientific phrases with 86%, and control with 63.8 % detection rates. The objective of this study is to characterize the feature extraction methods and classification phage in speech recognition system. The various approaches available for developing speech recognition system are compared along with their merits and demerits. The statistical results shows that signal recognition accuracy will be increased by using discrete Curvelet transforms over conventional methods.
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...ijcsit
Speech processing is considered as crucial and an intensive field of research in the growth of robust and efficient speech recognition system. But the accuracy for speech recognition still focuses for variation of context, speaker’s variability, and environment conditions. In this paper, we stated curvelet based Feature Extraction (CFE) method for speech recognition in noisy environment and the input speech signal is decomposed into different frequency channels using the characteristics of curvelet transform for reduce the computational complication and the feature vector size successfully and they have better accuracy, varying window size because of which they are suitable for non –stationary signals. For better word classification and recognition, discrete hidden markov model can be used and as they consider time distribution of
speech signals. The HMM classification method attained the maximum accuracy in term of identification rate for informal with 80.1%, scientific phrases with 86%, and control with 63.8 % detection rates. The objective of this study is to characterize the feature extraction methods and classification phage in speech
recognition system. The various approaches available for developing speech recognition system are compared along with their merits and demerits. The statistical results shows that signal recognition accuracy will be increased by using discrete Curvelet transforms over conventional methods.
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition TechniqueCSCJournals
Automatic speaker recognition system is used to recognize an unknown speaker among several reference speakers by making use of speaker-specific information from their speech. In this paper, we introduce a novel, hierarchical, text-independent speaker recognition. Our baseline speaker recognition system accuracy, built using statistical modeling techniques, gives an accuracy of 81% on the standard MIT database and our baseline gender recognition system gives an accuracy of 93.795%. We then propose and implement a novel state-space pruning technique by performing gender recognition before speaker recognition so as to improve the accuracy/timeliness of our baseline speaker recognition system. Based on the experiments conducted on the MIT database, we demonstrate that our proposed system improves the accuracy over the baseline system by approximately 2%, while reducing the computational time by more than 30%.
Isolated words recognition using mfcc, lpc and neural networkeSAT Journals
Abstract Automatic speech recognition is an important topic of speech processing. This paper presents the use of an Artificial Neural Network (ANN) for isolated word recognition. The Pre-processing is done and voiced speech is detected based on energy and zero crossing rates (ZCR). The proposed approach used in speech recognition is Mel Frequency Cepstral Coefficients (MFCC) and combine features of both MFCC and Linear Predictive Coding (LPC). The back-propagation is used as a classifier. The recognition accuracy is increased when combine features of both LPC and MFCC are used as compared to only MFCC approach using Neural Network as a classifier.. Keywords: Pre-processing, Mel frequency Cepstral Coefficient (MFCC), Linear Predictive Coding (LPC), Artificial Neural Network (ANN).
Design and implementation of different audio restoration techniques for audio...eSAT Journals
Abstract
Audio signals are corrupted with many types of distortions. Major audio distortions are categorized into Globalized and
Localized distortions. Localized distortion includes clipping and clicks where only certain samples are affected and globalized
distortions include broadband noise where complete bandwidth is consumed by noise. Audio restoration is a technique for giving
back the audio signals from these distortions. In this paper, audio restoration techniques for removing clipping, clicks and
broadband noise are put forwarded. Recent approaches to solving audio restoration problem is with respect to sparse
representation algorithms. Clipping distortion is addressed with a Sparse representation framework, it is treated as a reverse
problem, where the distorted samples is estimated from the surrounding undistorted samples, they are embedded in frame based
scheme, and reconstructed by using an overlap add method in conjunction with OMP algorithm and Gabor/DCT dictionary for
modelling audio signals. Broadband denoising is done by using spectral subtraction and Click removal is done by using an
adaptive filter method as the first step. Performance measures are done based on perception, average SNR calculation and
defined parameter variations. This paper also targeting towards the software and hardware implementation of the restoration
methods using TMS320C6713 DSK kit with help of tools mainly MATLAB and Code Composer studio.
Key Words: Audio Distortions, OMP algorithm, Gabor/DCT dictionary, TMS320C6713DSK
BIDIRECTIONAL LONG SHORT-TERM MEMORY (BILSTM)WITH CONDITIONAL RANDOM FIELDS (...kevig
This study investigates the effectiveness of Knowledge Named Entity Recognition in Online Judges (OJs). OJs are lacking in the classification of topics and limited to the IDs only. Therefore a lot of time is consumed in finding programming problems more specifically in knowledge entities.A Bidirectional Long Short-Term Memory (BiLSTM) with Conditional Random Fields (CRF) model is applied for the recognition of knowledge named entities existing in the solution reports.For the test run, more than 2000 solution reports are crawled from the Online Judges and processed for the model output. The stability of the model is also assessed with the higher F1 value. The results obtained through the proposed BiLSTM-CRF model are more effectual (F1: 98.96%) and efficient in lead-time.
Detection of speech under stress using spectral analysiseSAT Journals
Abstract This paper deals with an approach to the detection of speech in English language. The stress detection is necessary which provides real time information of state of mind of a person. Voice features from the speech signal is influenced by stress is MFCC is considered is this paper. To examine the effect of Exam-Stress on speech production an experiment was designed. First Year students of age group 18 to 20 were selected and assignment was given to them and instructs them that have viva on that assignment and their performance in the viva will decide their final internal marks in the examination. The experiment and the analysis of the test results are reported in this paper. Keywords: Speech, Stress, Spectral Analysis, Discrete Wavelet Transform, Artificial Neural Network.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
QUALITATIVE ANALYSIS OF PLP IN LSTM FOR BANGLA SPEECH RECOGNITIONijma
The performance of various acoustic feature extraction methods has been compared in this work using
Long Short-Term Memory (LSTM) neural network in a Bangla speech recognition system. The acoustic
features are a series of vectors that represents the speech signals. They can be classified in either words or
sub word units such as phonemes. In this work, at first linear predictive coding (LPC) is used as acoustic
vector extraction technique. LPC has been chosen due to its widespread popularity. Then other vector
extraction techniques like Mel frequency cepstral coefficients (MFCC) and perceptual linear prediction
(PLP) have also been used. These two methods closely resemble the human auditory system. These feature
vectors are then trained using the LSTM neural network. Then the obtained models of different phonemes
are compared with different statistical tools namely Bhattacharyya Distance and Mahalanobis Distance to
investigate the nature of those acoustic features.
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdffxintegritypublishin
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Dr.Costas Sachpazis
Terzaghi's soil bearing capacity theory, developed by Karl Terzaghi, is a fundamental principle in geotechnical engineering used to determine the bearing capacity of shallow foundations. This theory provides a method to calculate the ultimate bearing capacity of soil, which is the maximum load per unit area that the soil can support without undergoing shear failure. The Calculation HTML Code included.
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxR&R Consult
CFD analysis is incredibly effective at solving mysteries and improving the performance of complex systems!
Here's a great example: At a large natural gas-fired power plant, where they use waste heat to generate steam and energy, they were puzzled that their boiler wasn't producing as much steam as expected.
R&R and Tetra Engineering Group Inc. were asked to solve the issue with reduced steam production.
An inspection had shown that a significant amount of hot flue gas was bypassing the boiler tubes, where the heat was supposed to be transferred.
R&R Consult conducted a CFD analysis, which revealed that 6.3% of the flue gas was bypassing the boiler tubes without transferring heat. The analysis also showed that the flue gas was instead being directed along the sides of the boiler and between the modules that were supposed to capture the heat. This was the cause of the reduced performance.
Based on our results, Tetra Engineering installed covering plates to reduce the bypass flow. This improved the boiler's performance and increased electricity production.
It is always satisfying when we can help solve complex challenges like this. Do your systems also need a check-up or optimization? Give us a call!
Work done in cooperation with James Malloy and David Moelling from Tetra Engineering.
More examples of our work https://www.r-r-consult.dk/en/cases-en/
Student information management system project report ii.pdfKamal Acharya
Our project explains about the student management. This project mainly explains the various actions related to student details. This project shows some ease in adding, editing and deleting the student details. It also provides a less time consuming process for viewing, adding, editing and deleting the marks of the students.
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)MdTanvirMahtab2
This presentation is about the working procedure of Shahjalal Fertilizer Company Limited (SFCL). A Govt. owned Company of Bangladesh Chemical Industries Corporation under Ministry of Industries.
Isolated word recognition using lpc & vector quantization
1. IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163
__________________________________________________________________________________________
Volume: 01 Issue: 03 | Nov-2012, Available @ http://www.ijret.org 479
ISOLATED WORD RECOGNITION USING LPC & VECTOR
QUANTIZATION
M. K. Linga Murthy1
, G.L.N. Murthy2
1,2
Asst. Professor, ECE, Lakireddy Balireddy College of Engineering, Andhra Pradesh, India,
lingamurthy413@gmail.com, murthyfromtenali@gmail.com
Abstract
Speech recognition is always looked upon as a fascinating field in human computer interaction. It is one of the fundamental steps
towards understanding human recognition and their behavior. This paper explicates the theory and implementation of Speech
recognition. This is a speaker-dependent real time isolated word recognizer. The major logic used was to first obtain the feature
vectors using LPC which was followed by vector quantization. The quantized vectors were then recognized by measuring the
Minimum average distortion.
All Speech Recognition systems contain Two Main Phases, namely Training Phase and Testing Phase. In the Training Phase, the
Features of the words are extracted and during the recognition phase feature matching Takes place. The feature or the template thus
extracted is stored in the data base, during the recognition phase the extracted features are compared with the template in the
database. The features of the words are extracted by using LPC analysis. Vector Quantization is used for generating the code books.
Finally the recognition decision is made based on the matching score. MATLAB will be used to implement this concept to achieve
further understanding.
Index Terms: Speech Recognition, LPC, Vector Quantization, and Code Book.
-----------------------------------------------------------------------***-----------------------------------------------------------------------
1. INTRODUCTION
Speech is a natural mode of communication for people. We
learn all the relevant skills during early childhood, without
instruction, and we continue to rely on speech communication
throughout our lives. It comes so naturally to us that we don’t
realize how complex a phenomenon speech.
Speech recognition, or more commonly known as automatic
speech recognition (ASR), is the process of interpreting
human speech in a computer. A more technical definition is
given by Jurafsky, where he defines ASR as the building of
system for mapping acoustic signals to a string of words. He
continues by defining automatic speech understanding (ASU)
as extending the goal to producing some sort of understanding
of the sentence.
1.1 Challenges
The general problem of automatic transcription of speech by
any speaker in any environment is still far from solved. But
recent years have seen ASR technology mature to the point
where it is viable in certain limited domains.
1.2 Difficulties
One dimension of variation in speech recognition tasks is the
vocabulary size.
A second dimension of variation is how fluent, natural or
conversational the speech is isolated word recognition, in
which each word is surrounded by some sort of pause, is much
easier than recognizing continuous speech
A third dimension of variation is channel and noise.
Commercial dictation systems, and much laboratory research
in speech recognition, is done with high quality, head mounted
microphones
A final dimension of variation is accent or speaker-class
characteristics. The objective of this paper is to recognize the
isolated words spoken by the speaker. These results are very
useful for implementing the recognition systems. The words or
utterances are recorded by Microphone and are stored in work
space, then processed using MATLAB signal processing
toolbox. It involves pre-emphasis, frame blocking
autocorrelation analysis, LPC Analysis and Vector
quantization.
2. LPC FOR SPEECH RECOGNITION:
There are three basic steps in Speech Recognition, they are
Parameter estimation. ( in which the test pattern
is created )
Parameter Comparison.
Decision making.
2. IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163
__________________________________________________________________________________________
Volume: 01 Issue: 03 | Nov-2012, Available @ http://www.ijret.org 480
Fig2.1 Block diagram for LPC Processor for Speech
Recognition.
2.1 Pre – emphasis:
From the speech production model it is known that the speech
undergoes a spectral tilt of -6dB/oct. To counteract this fact a
pre-emphasis filter is used. The main goal of the pre-emphasis
filter is to boost the higher frequencies in order to flatten the
spectrum. Pre emphasis follows a 6 dB per octave rate. This
means that as the frequency doubles, the amplitude increases 6
dB. This is usually done between 300 - 3000 cycles. Pre
emphasis is needed in FM to maintain good signal to noise
ratio. Perhaps the most widely used pre emphasis network is
the fixed first-order system:
2.2 Frame – Blocking:
In this step the pre-emphasized speech signal is blocked into
frames of N samples, with adjacent frames being separated by
M samples .Thus frame blocking is done to reduce the mean
squared predication error over a short segment of the speech
wave form. In this step the pre emphasized speech
signal, is blocked into frames of N samples, with
adjacent frames being separated by M samples.
Fig2.2 blocking of speech into overlapping frames.
Typical values for N and M are 256 and 128 when the
sampling rate of the speech is 6.67 kHz. These correspond to
45-msec frames, separated by 15-msec, or a 66.7-Hz frame
rate.
2.3 Windowing:
Here we want to extract spectral features of entire utterance or
conversation, but the spectrum changes very quickly.
Technically, we say that speech is a non-stationary signal,
meaning that its statistical properties are not constant across
time. Instead, we want or extract spectral features from a
small window of speech that characterizes a particular sub-
phone and for which we can make the assumption that the
signal is stationary. this is done by using a window which is
non-zero inside some region and zero elsewhere, running this
window across the speech signal, and extracting the waveform
inside this window. A more common window used in feature
extraction is the Hamming window, which shrinks the values
of the signal toward zero at the window boundaries, avoiding
discontinuities.
2.4 Autocorrelation analysis:
Each frame of windowing signal is next auto correlated to give
Where the highest autocorrelation value, p, is the order of the
LPC analysis. Typically, values of p from 8 to 16 have been
used, with p = 10 being the value used for this systems. A side
benefit of the autocorrelation analysis is that the zeroth
autocorrelation, , is the energy of the frame.
2.5 LPC Analysis:
The next processing step is the LPC analysis, which converts
each frame of P+1 autocorrelations into an LPC parameter set
in which the set might be the LPC coefficients, the reflection
coefficients(PARCOR-co-efficient), the log area ratio
coefficients, or any desired information of the above sets. The
formal method for converting from autocorrelation
coefficients to an LPC parameter set ( for the Autocorrelation
method ) is known as Durbin’s method
2.6 LPC parameter conversion to Cepstral
coefficients:
A very important LPC parameter set, which can be derived
directly from the LPC coefficients set, is the LPC Cepstral
coefficients, . The recursion method is used. The
Cepstral coefficients, which are the coefficients of the Fourier
transform representation of the log magnitude spectrum, have
been shown to be a more robust, reliable feature set for speech
3. IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163
__________________________________________________________________________________________
Volume: 01 Issue: 03 | Nov-2012, Available @ http://www.ijret.org 481
recognition than the LPC coefficients, the PARCOR
coefficients, or the log area ratio coefficients.
3. VECTOR QUANTIZATION:
Vector quantization is one very efficient source-coding
technique. Vector quantization is a procedure that encodes a
vector of input (e.g., a segment of waveform or a parameter
vector that represents the segment spectrum) into an integer
(index) that is associated with an entry of a collection
(codebook) of reproduction vectors.
Using the basic pattern-recognition approach, the input
(unknown class or word) Speech pattern is next compared
with each class (or word) reference pattern and a measure
(score) of similarity between the unknown pattern and each
reference pattern is calculated.
Fig 4.1 A Vector Quantized based speech recognition system
It is the pattern recognition based approach to speech
recognition, we see that the M codebooks are analogous to M
(sets of) reference patterns (or templates) and the dissimilarity
measure is defined according to Eq. (4.1) and Eq. (4.2), where
no explicit time alignment is required.
The utterance is recognized as class K if
Where is a minimum average distortion.
And is an M average distortion score of all code
books.
4. SYSTEM IMPLEMENTATION:
The training set for the vector quantizer was obtained by
recording the utterances of set isolated words .The words are
recorded for a two different speakers. The recognition
vocabulary consisted of the names (Forward, Back, Left,
Right, and Stop). Here each word is applied for 10 times .The
results obtained are shown in the table below.
Fig 4.1 Comparisons for isolated word recognition system for
two speakers
CONCLUSIONS
Isolated Word Recognition using Linear Predictive Coding
and Vector Quantization provides basic idea for implementing
for Speech Recognition for Isolated Words. The Vector
Quantized based speech recognition is very simple method. In
this method by using LPC analysis extracts the features of
given words & vector quantization is used for feature
matching in Speech Recognition.
In the successful implementation the results were found to be
satisfactory considering less number of training data. The
accuracy of the real time system can be increased significantly
by using an improved speech detection/noise elimination
algorithm.
REFERENCES
[1]. L.R.Rabiner and B.H.Juang, “Fundamentals of
speech recognition”, Prentice Hall (Signal Processing
series) 1993.
[2]. Richard O.Duda, Peter E.Hart, David
G.Stork,“Pattern Classification”, John Wiley & Sons
(ASIA) Pte Ltd.
[3]. Y.Linde ,A.Buzo and R.M.Gray, “An algorithm for
vector quantizer design” ,IEEE Trans .COM-
28,January 1980 .
[4]. Mayukh Bhaowal and kunal chawla , “Isolated word
recognition for English language using LPC,VQ &
HMM” , students of IIT, Allahabad, India.
[5]. Poonam Bansal , Amita dev and Shail Bala Jain
“Automatic speaker Identification using VQ “,
Medwell journals,6(9) : 938 – 942 ,2007.
[6]. Lawrence R. Rabiner “Applications of speech
recognition in the area of Telecommunications”,
AT&T Labs Florham Park, New Jersey 07932, 0-
7803-3698-4/97/$10.00 0 1997 IEEE
[7]. L. R. Rabiner, “Applications of Voice Processing to
Telecommunications”, Proc. IEEE, Vol. 82, No. 4,
pp. 199-228, Feb. 1994.
4. IJRET: International Journal of Research in Engineering and Technology ISSN: 2319-1163
__________________________________________________________________________________________
Volume: 01 Issue: 03 | Nov-2012, Available @ http://www.ijret.org 482
[8]. L. R. Rabiner and R.W. Schafer “Digital processing
of Speech signals”, Prentice Hall (Signal Processing
series).
[9]. J.E.Shore and D.K.Burton “Discrete Utterance
Speech recognition without Time alignment”, IEEE
Transactions on IT, IT – 29(4), July 1983, pp. 473 –
491.
[10]. D. K. Burton, J. T. Buck, and F. Shore, "Parameter
Selection for Isolated Word Recognition Using
Vector Quantization," Proc. ICASSP 84, San Diego,
CA, pp. 9.4.1- 9.4.4, March 1984.
[11]. Douglas o’shaughnesy “Speech Communication”,
Universities press Electrical engineering Series, 2/e
,2001
BIOGRAPHIES:
M. K. Linga Murthy is currently
working as Asst. Professor in ECE in
LBRCE, Mylavaram. He completed
his B.Tech in the year 2001 at SJCET,
Yemmiganur. He completed his
M.Tech in the year 2008 at MITS,
Madanapalle. He has over 06 years of
teaching experience & his research an
area is signal processing.
G.L.N Murthy is currently working as
Asso. Professor in ECE in LBRCE,
Mylavaram. He completed his B.Tech in
JNTU, Anantapu. He completed his
M.Tech in JNTU, Anantapur. He
pursuing his PhD in SVU Tirupathi, he
has over 12 years of teaching experience
& his research an area is signal
processing