Complex Question answering system is developed to answer different types of questions accurately. Initially the question from the natural language is transformed to an internal representation which captures the semantics and intent of the question. In the proposed work, internal representation is provided with templates instead of using synonyms or keywords. Then for each internal representation, it is mapped to relevant query against the knowledge base. In present work, the Template representation based Convolutional Recurrent Neural Network (T-CRNN) is proposed for selecting answer in Complex Question Answering (CQA) framework. Recurrent neural network is used to obtain the exact correlation between answers and questions and the semantic matching among the collection of answers. Initially, the process of learning is accomplished through Convolutional Neural Network (CNN) which represents the questions and answers separately. Then the representation with fixed length is produced for each question with the help of fully connected neural network. In order to design the semantic matching between the answers, the representation of Question Answer (QA) pair is given into the Recurrent Neural Network (RNN). Finally, for the given question, the correctly correlated answers are identified with the softmax classifier.
Generation of Question and Answer from Unstructured Document using Gaussian M...IJACEE IJACEE
The document describes a system that automatically generates questions and answers from an unstructured document. It involves several steps: (1) simplifying complex sentences, (2) generating initial questions using named entities and semantic role labeling, (3) identifying subtopics using LDA and GMNTM models, (4) measuring syntactic correctness of questions, and (5) extracting answers using pattern matching. The system is expected to produce more accurate results compared to using only LDA for subtopic identification, as GMNTM also considers word order and semantics. Key techniques include semantic role labeling, Extended String Subsequence Kernel for similarity measurement, and syntactic tree kernel for question ranking.
IRJET- An Automated Approach to Conduct Pune University’s In-Sem ExaminationIRJET Journal
This document describes a proposed automated system for evaluating descriptive answers on technical exams in Pune University. The system would take students' written answers as input and assess them based on predefined model answers, scoring answers based on keywords, context, structure, and cosine similarity to the model answers. This would help address issues with current evaluation methods like human error, inconsistency, and resource intensiveness. The proposed system uses natural language processing and machine learning techniques to classify answers and determine scores. It aims to provide a more accurate and efficient evaluation process.
IRJET- Sentimental Analysis on Audio and VideoIRJET Journal
1) The document discusses sentiment analysis on audio and video content using techniques like automatic speech recognition and behavior detection. It aims to analyze sentiment expressed by entities in natural audio and video.
2) Sentiment detection has been explored more for text, while video sentiment detection is relatively under-explored. The proposed approach uses algorithms to tag audio with text, and can extend to detect scenarios and behaviors in video after audio and video feature extraction.
3) The approach uses part-of-speech tagging to extract text features for sentiment classification (positive or negative) using support vector machines and evaluates on public text and video datasets. This allows identification of keywords/phrases carrying important information to enhance search.
Question Answering has been a well-researched NLP area over recent years. It has become necessary for
users to be able to query through the variety of information available - be it structured or unstructured. In
this paper, we propose a Question Answering module which a) can consume a variety of data formats - a
heterogeneous data pipeline, which ingests data from product manuals, technical data forums, internal
discussion forums, groups, etc. b) addresses practical challenges faced in real-life situations by pointing to
the exact segment of the manual or chat threads which can solve a user query c) provides segments of texts
when deemed relevant, based on user query and business context. Our solution provides a comprehensive
and detailed pipeline that is composed of elaborate data ingestion, data parsing, indexing, and querying
modules. Our solution is capable of handling a plethora of data sources such as text, images, tables,
community forums, and flow charts. Our studies performed on a variety of business-specific datasets
represent the necessity of custom pipelines like the proposed one to solve several real-world document
question-answering
IRJET- Sentimental Analysis on Audio and Video using Vader Algorithm -Monali ...IRJET Journal
This document presents a proposed system for performing sentiment analysis on audio and video reviews from social media platforms. The system first collects audio and video data from sites like YouTube and Facebook. It then separates the audio and video files, converts them to .wav format, and extracts text from the audio and video files. This extracted text is then analyzed using the VADER sentiment analysis algorithm to determine the sentiment polarity (positive, negative, neutral) expressed in the text. VADER is a lexicon-based approach that rates words based on sentiment and calculates overall sentiment scores. The proposed system aims to analyze sentiment in audio and video reviews to better understand user opinions expressed across various social media platforms.
2. an efficient approach for web query preprocessing edit satIAESIJEECS
The emergence of the Web technology generated a massive amount of raw data by enabling Internet users to post their opinions, comments, and reviews on the web. To extract useful information from this raw data can be a very challenging task. Search engines play a critical role in these circumstances. User queries are becoming main issues for the search engines. Therefore a preprocessing operation is essential. In this paper, we present a framework for natural language preprocessing for efficient data retrieval and some of the required processing for effective retrieval such as elongated word handling, stop word removal, stemming, etc. This manuscript starts by building a manually annotated dataset and then takes the reader through the detailed steps of process. Experiments are conducted for special stages of this process to examine the accuracy of the system.
IRJET- Text Document Clustering using K-Means Algorithm IRJET Journal
This document discusses using the K-Means clustering algorithm to cluster text documents and compares it to using K-Means clustering with dimension reduction techniques. It uses the BBC Sports dataset containing 737 documents in 5 classes. The document outlines preprocessing the text, creating a document term matrix, applying K-Means clustering, and using dimension reduction techniques like InfoGain before clustering. It evaluates the different methods using precision, recall, accuracy, and F-measure, finding that K-Means with InfoGain dimension reduction outperforms standard K-Means clustering.
QUESTION ANSWERING SYSTEM USING ONTOLOGY IN MARATHI LANGUAGEijaia
This document discusses a proposed question answering system for the Marathi language that uses ontology as a knowledge base. The system aims to provide accurate answers to user questions in Marathi by analyzing queries semantically using ontologies. Ontologies are developed with help from domain experts and represent domain knowledge through semantic relations. The system first analyzes user questions syntactically and semantically. It then extracts candidate answers from the ontology and generates a precise answer in Marathi language to satisfy the original user query. The use of ontology for semantic analysis is meant to enhance the accuracy of answers provided by the question answering system.
Generation of Question and Answer from Unstructured Document using Gaussian M...IJACEE IJACEE
The document describes a system that automatically generates questions and answers from an unstructured document. It involves several steps: (1) simplifying complex sentences, (2) generating initial questions using named entities and semantic role labeling, (3) identifying subtopics using LDA and GMNTM models, (4) measuring syntactic correctness of questions, and (5) extracting answers using pattern matching. The system is expected to produce more accurate results compared to using only LDA for subtopic identification, as GMNTM also considers word order and semantics. Key techniques include semantic role labeling, Extended String Subsequence Kernel for similarity measurement, and syntactic tree kernel for question ranking.
IRJET- An Automated Approach to Conduct Pune University’s In-Sem ExaminationIRJET Journal
This document describes a proposed automated system for evaluating descriptive answers on technical exams in Pune University. The system would take students' written answers as input and assess them based on predefined model answers, scoring answers based on keywords, context, structure, and cosine similarity to the model answers. This would help address issues with current evaluation methods like human error, inconsistency, and resource intensiveness. The proposed system uses natural language processing and machine learning techniques to classify answers and determine scores. It aims to provide a more accurate and efficient evaluation process.
IRJET- Sentimental Analysis on Audio and VideoIRJET Journal
1) The document discusses sentiment analysis on audio and video content using techniques like automatic speech recognition and behavior detection. It aims to analyze sentiment expressed by entities in natural audio and video.
2) Sentiment detection has been explored more for text, while video sentiment detection is relatively under-explored. The proposed approach uses algorithms to tag audio with text, and can extend to detect scenarios and behaviors in video after audio and video feature extraction.
3) The approach uses part-of-speech tagging to extract text features for sentiment classification (positive or negative) using support vector machines and evaluates on public text and video datasets. This allows identification of keywords/phrases carrying important information to enhance search.
Question Answering has been a well-researched NLP area over recent years. It has become necessary for
users to be able to query through the variety of information available - be it structured or unstructured. In
this paper, we propose a Question Answering module which a) can consume a variety of data formats - a
heterogeneous data pipeline, which ingests data from product manuals, technical data forums, internal
discussion forums, groups, etc. b) addresses practical challenges faced in real-life situations by pointing to
the exact segment of the manual or chat threads which can solve a user query c) provides segments of texts
when deemed relevant, based on user query and business context. Our solution provides a comprehensive
and detailed pipeline that is composed of elaborate data ingestion, data parsing, indexing, and querying
modules. Our solution is capable of handling a plethora of data sources such as text, images, tables,
community forums, and flow charts. Our studies performed on a variety of business-specific datasets
represent the necessity of custom pipelines like the proposed one to solve several real-world document
question-answering
IRJET- Sentimental Analysis on Audio and Video using Vader Algorithm -Monali ...IRJET Journal
This document presents a proposed system for performing sentiment analysis on audio and video reviews from social media platforms. The system first collects audio and video data from sites like YouTube and Facebook. It then separates the audio and video files, converts them to .wav format, and extracts text from the audio and video files. This extracted text is then analyzed using the VADER sentiment analysis algorithm to determine the sentiment polarity (positive, negative, neutral) expressed in the text. VADER is a lexicon-based approach that rates words based on sentiment and calculates overall sentiment scores. The proposed system aims to analyze sentiment in audio and video reviews to better understand user opinions expressed across various social media platforms.
2. an efficient approach for web query preprocessing edit satIAESIJEECS
The emergence of the Web technology generated a massive amount of raw data by enabling Internet users to post their opinions, comments, and reviews on the web. To extract useful information from this raw data can be a very challenging task. Search engines play a critical role in these circumstances. User queries are becoming main issues for the search engines. Therefore a preprocessing operation is essential. In this paper, we present a framework for natural language preprocessing for efficient data retrieval and some of the required processing for effective retrieval such as elongated word handling, stop word removal, stemming, etc. This manuscript starts by building a manually annotated dataset and then takes the reader through the detailed steps of process. Experiments are conducted for special stages of this process to examine the accuracy of the system.
IRJET- Text Document Clustering using K-Means Algorithm IRJET Journal
This document discusses using the K-Means clustering algorithm to cluster text documents and compares it to using K-Means clustering with dimension reduction techniques. It uses the BBC Sports dataset containing 737 documents in 5 classes. The document outlines preprocessing the text, creating a document term matrix, applying K-Means clustering, and using dimension reduction techniques like InfoGain before clustering. It evaluates the different methods using precision, recall, accuracy, and F-measure, finding that K-Means with InfoGain dimension reduction outperforms standard K-Means clustering.
QUESTION ANSWERING SYSTEM USING ONTOLOGY IN MARATHI LANGUAGEijaia
This document discusses a proposed question answering system for the Marathi language that uses ontology as a knowledge base. The system aims to provide accurate answers to user questions in Marathi by analyzing queries semantically using ontologies. Ontologies are developed with help from domain experts and represent domain knowledge through semantic relations. The system first analyzes user questions syntactically and semantically. It then extracts candidate answers from the ontology and generates a precise answer in Marathi language to satisfy the original user query. The use of ontology for semantic analysis is meant to enhance the accuracy of answers provided by the question answering system.
IRJET-Classifying Mined Online Discussion Data for Reflective Thinking based ...IRJET Journal
This document presents a methodology for classifying mined online discussion data to identify reflective thinking based on ontology. It involves the following steps:
1. Collecting online discussion data and preprocessing it by removing stop words and punctuation.
2. Implementing inductive content analysis to categorize the data into six types of reflective thinking.
3. Training a Naive Bayes classifier on the categorized data to classify new data.
4. Applying the trained model to large scale unlabeled online discussion data.
5. Using ontology to provide a deeper classification of topics in the data beyond the six reflective thinking categories. This allows extraction of additional knowledge from the classified text data.
FAST FUZZY FEATURE CLUSTERING FOR TEXT CLASSIFICATION cscpconf
Feature clustering is a powerful method to reduce the dimensionality of feature vectors for text
classification. In this paper, Fast Fuzzy Feature clustering for text classification is proposed. It
is based on the framework proposed by Jung-Yi Jiang, Ren-Jia Liou and Shie-Jue Lee in 2011.
The word in the feature vector of the document is grouped into the cluster in less iteration. The
numbers of iterations required to obtain cluster centers are reduced by transforming clusters
center dimension from n-dimension to 2-dimension. Principle Component Analysis with slit
change is used for dimension reduction. Experimental results show that, this method improve
the performance by significantly reducing the number of iterations required to obtain the cluster
center. The same is being verified with three benchmark datasets
This document presents a system for detecting semantically similar questions in online forums like Quora to reduce duplicate content. It proposes using natural language processing techniques like tagging questions with keywords, vectorizing text with Google News vectors, and calculating similarity with Word Mover's Distance. The system cleans and preprocesses questions before generating tags and calculating similarity between questions to identify duplicates. An evaluation of the system achieved accurate detection of matching and non-matching question pairs.
IRJET- Sentiment Analysis to Segregate Attributes using Machine Learning Tech...IRJET Journal
This document discusses sentiment analysis techniques using machine learning. It provides an overview of various supervised and unsupervised machine learning algorithms that can be used for sentiment analysis, including Naive Bayes, SVM, neural networks, decision trees, and BERT. The document also describes the system architecture of a proposed sentiment analysis system that would use a BERT model to identify and classify sentiments in text data as positive, negative, or neutral after preprocessing the data. The system aims to improve sentiment analysis efficiency by taking a holistic approach to attribute identification and classification.
Predicting the Presence of Learning Motivation in Electronic Learning: A New ...TELKOMNIKA JOURNAL
Research affirms that electronic learning (e-learning) has a great deal of advantages to learning process, it’s provides learners with flexibility in terms of time, place, and pace. That easiness was not be a guarantee of a good learning outcomes though they got the same learning material and course, because the fact is the outcomes was not the same from one to another and even not satisfy enough. One of prerequisite to the successful of e-learning process is the presence of learning motivation and it was not easy to identify. We propose a novel model to predicting the presence of learning motivation in e-learning using those attributes that have been identified in previous research. This model has been built using WEKA toolkit by comparing fourteen algorithms for Tree Classifier and ten-fold cross validation testing methods to process 3.200 of data sets. The best accuracy reached at 91.1% and identified four parameters to predict the presence of learning motivation in e-learning, and aim to assist teachers in identify whether student needs motivation or does not. This study also confirmed that Tree Classifier still has the best accuracy to predict and classify academic performance, it was reached average 90.48% for fourteen algorithm for accuracy value.
Sentiment Analysis and Classification of Tweets using Data MiningIRJET Journal
This document summarizes research on using data mining techniques to perform sentiment analysis on tweets. The researchers collected tweets from Twitter and preprocessed the text to make it usable for building sentiment classifiers. They used three classifiers - K-Nearest Neighbor, Naive Bayes, and Decision Tree - and compared the results to determine which provided the best accuracy. Rapid Miner tool was used to preprocess the text, build the classifiers, and analyze the results. The goal was to determine people's sentiments expressed in their tweets and correctly classify them.
Intelligent information extraction based on artificial neural networkijfcstjournal
Question Answering System (QAS) is used for information retrieval and natural language processing
(NLP) to reduce human effort. There are numerous QAS based on the user documents present today, but
they all are limited to providing objective answers and process simple questions only. Complex questions
cannot be answered by the existing QAS, as they require interpretation of the current and old data as well
as the question asked by the user. The above limitations can be overcome by using deep cases and neural
network. Hence we propose a modified QAS in which we create a deep artificial neural network with
associative memory from text documents. The modified QAS processes the contents of the text document
provided to it and find the answer to even complex questions in the documents.
IRJET - Deep Collaborrative Filtering with Aspect InformationIRJET Journal
This document discusses a proposed system for deep collaborative filtering with aspect information. The system aims to help web users efficiently locate relevant information on unfamiliar topics to increase their knowledge. It utilizes techniques like multi-keyword search, synonym matching, and ontology mapping to return relevant web links, images, and news articles to the user based on their search terms. The proposed system architecture includes an index structure to efficiently search and rank results based on similarity to the search query terms. The implementation and evaluation of the proposed system are also discussed.
EVALUATION OF SINGLE-SPAN MODELS ON EXTRACTIVE MULTI-SPAN QUESTION-ANSWERINGdannyijwest
Machine Reading Comprehension (MRC), particularly extractive close-domain question-answering, is a prominent field in Natural Language Processing (NLP). Given a question and a passage or set of passages, a machine must be able to extract the appropriate answer from the passage(s). However, the majority of these existing questions have only one answer, and more substantial testing on questions with multiple answers, or multi-span questions, has not yet been applied. Thus, we introduce a newly compiled dataset consisting of questions with multiple answers that originate from previously existing datasets. In addition, we run BERT-based models pre-trained for question-answering on our constructed dataset to evaluate their reading comprehension abilities. Runtime of base models on the entire datasetis approximately one day while the runtime for all models on a third of the dataset is a little over two days. Among the three of BERT-based models we ran, RoBERTa exhibits the highest consistent performance, regardless of size. We find that all our models perform similarly on this new, multi-span dataset compared to the single-span source datasets. While the models tested on the source datasets were slightly fine-tuned in order to return multiple answers, performance is similar enough to judge that task formulation does not drastically affect question-answering abilities. Our evaluations indicate that these models are indeed capable of adjusting to answer questions that require multiple answers. We hope that our findings will assist future development in question-answering and improve existing question-answering products and methods
Predicting depression using deep learning and ensemble algorithms on raw twit...IJECEIAES
Social network and microblogging sites such as Twitter are widespread amongst all generations nowadays where people connect and share their feelings, emotions, pursuits etc. Depression, one of the most common mental disorder, is an acute state of sadness where person loses interest in all activities. If not treated immediately this can result in dire consequences such as death. In this era of virtual world, people are more comfortable in expressing their emotions in such sites as they have become a part and parcel of everyday lives. The research put forth thus, employs machine learning classifiers on the twitter data set to detect if a person’s tweet indicates any sign of depression or not.
This document summarizes a proposed framework for sentiment classification using fuzzy logic. The framework aims to detect both implicit and explicit sentiment expressions in text by incorporating multiple datasets and techniques. It involves preprocessing text data, classifying sentiment, applying fuzzy logic to reduce emotions, and using fuzzy c-means clustering to further group similar emotions. The goal is to more accurately extract sentiment from transcripts by identifying both implicit and explicit expressions as well as topics through this combined approach. Evaluation metrics like precision, recall and f-measure will be used to assess performance.
A scalable, lexicon based technique for sentiment analysisijfcstjournal
Rapid increase in the volume of sentiment rich social media on the web has resulted in an increased
interest among researchers regarding Sentimental Analysis and opinion mining. However, with so much
social media available on the web, sentiment analysis is now considered as a big data task. Hence the
conventional sentiment analysis approaches fails to efficiently handle the vast amount of sentiment data
available now a days. The main focus of the research was to find such a technique that can efficiently
perform sentiment analysis on big data sets. A technique that can categorize the text as positive, negative
and neutral in a fast and accurate manner. In the research, sentiment analysis was performed on a large
data set of tweets using Hadoop and the performance of the technique was measured in form of speed and
accuracy. The experimental results shows that the technique exhibits very good efficiency in handling big
sentiment data sets.
Survey of Various Approaches of Emotion Detection Via Multimodal ApproachIRJET Journal
This document summarizes various approaches for multimodal emotion detection using features extracted from text, audio, and video data. It discusses how combining multiple modalities can provide more comprehensive insight into a user's emotions compared to a single modality. The document reviews related literature on audio-video, text-video, and multimodal emotion classification systems. It also describes approaches for feature extraction from text, such as identifying semantic words and concepts, and from video, including face detection and facial feature extraction. The proposed system aims to predict emotions by combining features from Twitter text and real-time video captured while a user completes a depression questionnaire.
This document summarizes a journal article that evaluates an e-assessment system for automatically scoring short-answer free-text questions. The system uses natural language processing techniques to match student responses to predefined model answers. A study was conducted using this system to assess Open University students. Results found the system could accurately score responses, with accuracy similar to or greater than human markers. The system provides immediate feedback to students to help them learn from mistakes.
A Novel Approach for Keyword extraction in learning objects using text miningIJSRD
Keyword extraction, concept finding are in learning objects is very important subject in today’s eLearning environment. Keywords are subset of words that contains the useful information about the content of the document. Keyword extraction is a process that is used to get the important keywords from documents. In this proposed System Decision tree algorithm is used for feature selection process using wordnet dictionary. WordNet is a lexical database of English which is used to find similarity from the candidate words. The words having highest similarity are taken as keywords.
Question Answering System using machine learning approachGarima Nanda
In a compact form, this is a presentation reflecting how the machine learning approach can be used for the effective and efficient interaction using classification techniques.
This document discusses approaches to resolving ambiguity in user queries to search engines. It first introduces the problem of ambiguity, where a single search term can have multiple meanings. It then overviews existing ranking algorithms used by search engines. The document proposes a dynamic page ranking algorithm to better disambiguate user intent by considering the context of the query. It evaluates the approach using mean average precision on the ambiguous term "beat". Experimental results show the dynamic approach outperforms a baseline page ranking algorithm on this task.
Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...IRJET Journal
The document proposes using statistical machine translation via non-negative matrix factorization to address word ambiguity and mismatch problems in question retrieval for community question answering systems. It translates questions into other languages using Google Translate to leverage contextual information, representing the original and translated questions together in a matrix. Experimental results on a real CQA dataset show this approach improves over methods relying only on surface text matching.
Manta ray optimized deep contextualized bi-directional long short-term memor...IJECEIAES
Complex question answering (CQA) is used for human knowledge answering and community questions answering. CQA system is essential to overcome the complexities present in the question answering system. The existing techniques ignores the queries structure and resulting a significant number of noisy queries. The complex queries, distributed knowledge, composite approaches, templates, and ambiguity are the common challenges faced by the CQA. To solve these issues, this paper presents a new manta ray foraging optimized deep contextualized bidirectional long-short term memory based adaptive galactic swarm optimization (MDCBiLSTMAGSO) for CQA. At first, the given input question is preprocessed and the similarity assessment is performed to eliminate the misclassification. Afterwards, the attained keywords are mapped into applicant results to improve the answer selection. Next, a new similarity approach named InfoSelectivity is introduced for semantic similarity evaluation based on the closeness among elements. Then, the relevant answers are classified through the MDCBiLSTM and optimized by a new manta ray foraging optimization (MRFO). Finally, adaptive galactic swarm optimization (AGSO) resultant is the best output. The proposed scheme is implemented on the JAVA platform and the outputs of designed approach achieved the better results when compared with the existing approaches in average accuracy (98.2%).
Architecture of an ontology based domain-specific natural language question a...IJwest
The document summarizes the architecture of an ontology-based domain-specific natural language question answering system. The proposed architecture defines four main modules: 1) question processing which analyzes and classifies questions and reformulates queries, 2) document retrieval which retrieves relevant documents, 3) document processing which processes retrieved documents, and 4) answer extraction which extracts and generates responses. Natural language processing techniques and ontologies are used to analyze questions and documents and extract relationships and answers. The system aims to generate concise, specific answers to natural language questions in a given domain and achieved 94% accuracy in testing.
IRJET-Classifying Mined Online Discussion Data for Reflective Thinking based ...IRJET Journal
This document presents a methodology for classifying mined online discussion data to identify reflective thinking based on ontology. It involves the following steps:
1. Collecting online discussion data and preprocessing it by removing stop words and punctuation.
2. Implementing inductive content analysis to categorize the data into six types of reflective thinking.
3. Training a Naive Bayes classifier on the categorized data to classify new data.
4. Applying the trained model to large scale unlabeled online discussion data.
5. Using ontology to provide a deeper classification of topics in the data beyond the six reflective thinking categories. This allows extraction of additional knowledge from the classified text data.
FAST FUZZY FEATURE CLUSTERING FOR TEXT CLASSIFICATION cscpconf
Feature clustering is a powerful method to reduce the dimensionality of feature vectors for text
classification. In this paper, Fast Fuzzy Feature clustering for text classification is proposed. It
is based on the framework proposed by Jung-Yi Jiang, Ren-Jia Liou and Shie-Jue Lee in 2011.
The word in the feature vector of the document is grouped into the cluster in less iteration. The
numbers of iterations required to obtain cluster centers are reduced by transforming clusters
center dimension from n-dimension to 2-dimension. Principle Component Analysis with slit
change is used for dimension reduction. Experimental results show that, this method improve
the performance by significantly reducing the number of iterations required to obtain the cluster
center. The same is being verified with three benchmark datasets
This document presents a system for detecting semantically similar questions in online forums like Quora to reduce duplicate content. It proposes using natural language processing techniques like tagging questions with keywords, vectorizing text with Google News vectors, and calculating similarity with Word Mover's Distance. The system cleans and preprocesses questions before generating tags and calculating similarity between questions to identify duplicates. An evaluation of the system achieved accurate detection of matching and non-matching question pairs.
IRJET- Sentiment Analysis to Segregate Attributes using Machine Learning Tech...IRJET Journal
This document discusses sentiment analysis techniques using machine learning. It provides an overview of various supervised and unsupervised machine learning algorithms that can be used for sentiment analysis, including Naive Bayes, SVM, neural networks, decision trees, and BERT. The document also describes the system architecture of a proposed sentiment analysis system that would use a BERT model to identify and classify sentiments in text data as positive, negative, or neutral after preprocessing the data. The system aims to improve sentiment analysis efficiency by taking a holistic approach to attribute identification and classification.
Predicting the Presence of Learning Motivation in Electronic Learning: A New ...TELKOMNIKA JOURNAL
Research affirms that electronic learning (e-learning) has a great deal of advantages to learning process, it’s provides learners with flexibility in terms of time, place, and pace. That easiness was not be a guarantee of a good learning outcomes though they got the same learning material and course, because the fact is the outcomes was not the same from one to another and even not satisfy enough. One of prerequisite to the successful of e-learning process is the presence of learning motivation and it was not easy to identify. We propose a novel model to predicting the presence of learning motivation in e-learning using those attributes that have been identified in previous research. This model has been built using WEKA toolkit by comparing fourteen algorithms for Tree Classifier and ten-fold cross validation testing methods to process 3.200 of data sets. The best accuracy reached at 91.1% and identified four parameters to predict the presence of learning motivation in e-learning, and aim to assist teachers in identify whether student needs motivation or does not. This study also confirmed that Tree Classifier still has the best accuracy to predict and classify academic performance, it was reached average 90.48% for fourteen algorithm for accuracy value.
Sentiment Analysis and Classification of Tweets using Data MiningIRJET Journal
This document summarizes research on using data mining techniques to perform sentiment analysis on tweets. The researchers collected tweets from Twitter and preprocessed the text to make it usable for building sentiment classifiers. They used three classifiers - K-Nearest Neighbor, Naive Bayes, and Decision Tree - and compared the results to determine which provided the best accuracy. Rapid Miner tool was used to preprocess the text, build the classifiers, and analyze the results. The goal was to determine people's sentiments expressed in their tweets and correctly classify them.
Intelligent information extraction based on artificial neural networkijfcstjournal
Question Answering System (QAS) is used for information retrieval and natural language processing
(NLP) to reduce human effort. There are numerous QAS based on the user documents present today, but
they all are limited to providing objective answers and process simple questions only. Complex questions
cannot be answered by the existing QAS, as they require interpretation of the current and old data as well
as the question asked by the user. The above limitations can be overcome by using deep cases and neural
network. Hence we propose a modified QAS in which we create a deep artificial neural network with
associative memory from text documents. The modified QAS processes the contents of the text document
provided to it and find the answer to even complex questions in the documents.
IRJET - Deep Collaborrative Filtering with Aspect InformationIRJET Journal
This document discusses a proposed system for deep collaborative filtering with aspect information. The system aims to help web users efficiently locate relevant information on unfamiliar topics to increase their knowledge. It utilizes techniques like multi-keyword search, synonym matching, and ontology mapping to return relevant web links, images, and news articles to the user based on their search terms. The proposed system architecture includes an index structure to efficiently search and rank results based on similarity to the search query terms. The implementation and evaluation of the proposed system are also discussed.
EVALUATION OF SINGLE-SPAN MODELS ON EXTRACTIVE MULTI-SPAN QUESTION-ANSWERINGdannyijwest
Machine Reading Comprehension (MRC), particularly extractive close-domain question-answering, is a prominent field in Natural Language Processing (NLP). Given a question and a passage or set of passages, a machine must be able to extract the appropriate answer from the passage(s). However, the majority of these existing questions have only one answer, and more substantial testing on questions with multiple answers, or multi-span questions, has not yet been applied. Thus, we introduce a newly compiled dataset consisting of questions with multiple answers that originate from previously existing datasets. In addition, we run BERT-based models pre-trained for question-answering on our constructed dataset to evaluate their reading comprehension abilities. Runtime of base models on the entire datasetis approximately one day while the runtime for all models on a third of the dataset is a little over two days. Among the three of BERT-based models we ran, RoBERTa exhibits the highest consistent performance, regardless of size. We find that all our models perform similarly on this new, multi-span dataset compared to the single-span source datasets. While the models tested on the source datasets were slightly fine-tuned in order to return multiple answers, performance is similar enough to judge that task formulation does not drastically affect question-answering abilities. Our evaluations indicate that these models are indeed capable of adjusting to answer questions that require multiple answers. We hope that our findings will assist future development in question-answering and improve existing question-answering products and methods
Predicting depression using deep learning and ensemble algorithms on raw twit...IJECEIAES
Social network and microblogging sites such as Twitter are widespread amongst all generations nowadays where people connect and share their feelings, emotions, pursuits etc. Depression, one of the most common mental disorder, is an acute state of sadness where person loses interest in all activities. If not treated immediately this can result in dire consequences such as death. In this era of virtual world, people are more comfortable in expressing their emotions in such sites as they have become a part and parcel of everyday lives. The research put forth thus, employs machine learning classifiers on the twitter data set to detect if a person’s tweet indicates any sign of depression or not.
This document summarizes a proposed framework for sentiment classification using fuzzy logic. The framework aims to detect both implicit and explicit sentiment expressions in text by incorporating multiple datasets and techniques. It involves preprocessing text data, classifying sentiment, applying fuzzy logic to reduce emotions, and using fuzzy c-means clustering to further group similar emotions. The goal is to more accurately extract sentiment from transcripts by identifying both implicit and explicit expressions as well as topics through this combined approach. Evaluation metrics like precision, recall and f-measure will be used to assess performance.
A scalable, lexicon based technique for sentiment analysisijfcstjournal
Rapid increase in the volume of sentiment rich social media on the web has resulted in an increased
interest among researchers regarding Sentimental Analysis and opinion mining. However, with so much
social media available on the web, sentiment analysis is now considered as a big data task. Hence the
conventional sentiment analysis approaches fails to efficiently handle the vast amount of sentiment data
available now a days. The main focus of the research was to find such a technique that can efficiently
perform sentiment analysis on big data sets. A technique that can categorize the text as positive, negative
and neutral in a fast and accurate manner. In the research, sentiment analysis was performed on a large
data set of tweets using Hadoop and the performance of the technique was measured in form of speed and
accuracy. The experimental results shows that the technique exhibits very good efficiency in handling big
sentiment data sets.
Survey of Various Approaches of Emotion Detection Via Multimodal ApproachIRJET Journal
This document summarizes various approaches for multimodal emotion detection using features extracted from text, audio, and video data. It discusses how combining multiple modalities can provide more comprehensive insight into a user's emotions compared to a single modality. The document reviews related literature on audio-video, text-video, and multimodal emotion classification systems. It also describes approaches for feature extraction from text, such as identifying semantic words and concepts, and from video, including face detection and facial feature extraction. The proposed system aims to predict emotions by combining features from Twitter text and real-time video captured while a user completes a depression questionnaire.
This document summarizes a journal article that evaluates an e-assessment system for automatically scoring short-answer free-text questions. The system uses natural language processing techniques to match student responses to predefined model answers. A study was conducted using this system to assess Open University students. Results found the system could accurately score responses, with accuracy similar to or greater than human markers. The system provides immediate feedback to students to help them learn from mistakes.
A Novel Approach for Keyword extraction in learning objects using text miningIJSRD
Keyword extraction, concept finding are in learning objects is very important subject in today’s eLearning environment. Keywords are subset of words that contains the useful information about the content of the document. Keyword extraction is a process that is used to get the important keywords from documents. In this proposed System Decision tree algorithm is used for feature selection process using wordnet dictionary. WordNet is a lexical database of English which is used to find similarity from the candidate words. The words having highest similarity are taken as keywords.
Question Answering System using machine learning approachGarima Nanda
In a compact form, this is a presentation reflecting how the machine learning approach can be used for the effective and efficient interaction using classification techniques.
This document discusses approaches to resolving ambiguity in user queries to search engines. It first introduces the problem of ambiguity, where a single search term can have multiple meanings. It then overviews existing ranking algorithms used by search engines. The document proposes a dynamic page ranking algorithm to better disambiguate user intent by considering the context of the query. It evaluates the approach using mean average precision on the ambiguous term "beat". Experimental results show the dynamic approach outperforms a baseline page ranking algorithm on this task.
Question Retrieval in Community Question Answering via NON-Negative Matrix Fa...IRJET Journal
The document proposes using statistical machine translation via non-negative matrix factorization to address word ambiguity and mismatch problems in question retrieval for community question answering systems. It translates questions into other languages using Google Translate to leverage contextual information, representing the original and translated questions together in a matrix. Experimental results on a real CQA dataset show this approach improves over methods relying only on surface text matching.
Manta ray optimized deep contextualized bi-directional long short-term memor...IJECEIAES
Complex question answering (CQA) is used for human knowledge answering and community questions answering. CQA system is essential to overcome the complexities present in the question answering system. The existing techniques ignores the queries structure and resulting a significant number of noisy queries. The complex queries, distributed knowledge, composite approaches, templates, and ambiguity are the common challenges faced by the CQA. To solve these issues, this paper presents a new manta ray foraging optimized deep contextualized bidirectional long-short term memory based adaptive galactic swarm optimization (MDCBiLSTMAGSO) for CQA. At first, the given input question is preprocessed and the similarity assessment is performed to eliminate the misclassification. Afterwards, the attained keywords are mapped into applicant results to improve the answer selection. Next, a new similarity approach named InfoSelectivity is introduced for semantic similarity evaluation based on the closeness among elements. Then, the relevant answers are classified through the MDCBiLSTM and optimized by a new manta ray foraging optimization (MRFO). Finally, adaptive galactic swarm optimization (AGSO) resultant is the best output. The proposed scheme is implemented on the JAVA platform and the outputs of designed approach achieved the better results when compared with the existing approaches in average accuracy (98.2%).
Architecture of an ontology based domain-specific natural language question a...IJwest
The document summarizes the architecture of an ontology-based domain-specific natural language question answering system. The proposed architecture defines four main modules: 1) question processing which analyzes and classifies questions and reformulates queries, 2) document retrieval which retrieves relevant documents, 3) document processing which processes retrieved documents, and 4) answer extraction which extracts and generates responses. Natural language processing techniques and ontologies are used to analyze questions and documents and extract relationships and answers. The system aims to generate concise, specific answers to natural language questions in a given domain and achieved 94% accuracy in testing.
AN AUTOMATED MULTIPLE-CHOICE QUESTION GENERATION USING NATURAL LANGUAGE PROCE...kevig
Automatic multiple-choice question generation (MCQG) is a useful yet challenging task in Natural Language
Processing (NLP). It is the task of automatic generation of correct and relevant questions from textual data.
Despite its usefulness, manually creating sizeable, meaningful and relevant questions is a time-consuming
and challenging task for teachers. In this paper, we present an NLP-based system for automatic MCQG for
Computer-Based Testing Examination (CBTE).We used NLP technique to extract keywords that are
important words in a given lesson material. To validate that the system is not perverse, five lesson materials
were used to check the effectiveness and efficiency of the system. The manually extracted keywords by the
teacher were compared to the auto-generated keywords and the result shows that the system was capable of
extracting keywords from lesson materials in setting examinable questions. This outcome is presented in a
user-friendly interface for easy accessibility.
IRJET- Factoid Question and Answering SystemIRJET Journal
This document describes a factoid question answering system that uses neural networks and the Tensorflow framework. The system takes in a text document and question as input. It then processes the input using techniques like gated recurrent units and support vector machines to classify the question. The system calculates attention between facts and the question, modifies its memory, and identifies the word closest to the answer to output as the response. Key aspects of the system include training a question answering engine with Tensorflow, storing and retrieving data, and generating the final answer.
The document describes an intelligent question answering system that can leverage heterogeneous datasets including product manuals, technical forums, and discussion threads. The system includes four main modules: 1) A document parser that can parse different data types including text, images, tables, and forums using deep learning models. 2) A document indexer that indexes documents for retrieval. 3) A document retriever that handles query processing and identifies relevant text segments. 4) A document reader that provides answers by analyzing relevant text segments. The system aims to reduce the time needed to find answers across different data sources by automatically identifying the most relevant information for a given question.
Knowledge Graph and Similarity Based Retrieval Method for Query Answering SystemIRJET Journal
This document proposes a knowledge graph and question answering system to extract and analyze information from large volumes of unstructured data like annual reports. It discusses using natural language processing techniques like named entity recognition with spaCy and dependency parsing to extract entity-relation pairs from text and construct a knowledge graph. For question answering, it analyzes user queries with similar NLP approaches and then matches query triplets to the knowledge graph to retrieve answers, combining information retrieval and trained classifiers. The proposed system aims to provide faster understanding and analysis of complex, unstructured data for professionals.
A simplified classification computational model of opinion mining using deep ...IJECEIAES
Opinion and attempts to develop an automated system to determine people's viewpoints towards various units such as events, topics, products, services, organizations, individuals, and issues. Opinion analysis from the natural text can be regarded as a text and sequence classification problem which poses high feature space due to the involvement of dynamic information that needs to be addressed precisely. This paper introduces effective modelling of human opinion analysis from social media data subjected to complex and dynamic content. Firstly, a customized preprocessing operation based on natural language processing mechanisms as an effective data treatment process towards building quality-aware input data. On the other hand, a suitable deep learning technique, bidirectional long short term-memory (Bi-LSTM), is implemented for the opinion classification, followed by a data modelling process where truncating and padding is performed manually to achieve better data generalization in the training phase. The design and development of the model are carried on the MATLAB tool. The performance analysis has shown that the proposed system offers a significant advantage in terms of classification accuracy and less training time due to a reduction in the feature space by the data treatment operation.
Application of hidden markov model in question answering systemsijcsa
By the increase of the volume of the saved information on web, Question Answering (QA) systems have been very important for Information Retrieval (IR). QA systems are a specialized form of information retrieval. Given a collection of documents, a Question Answering system attempts to retrieve correct answers to questions posed in natural language. Web QA system is a sample of QA systems that in this system answers retrieval from web environment doing. In contrast to the databases, the saved information on web does not follow a distinct structure and are not generally defined. Web QA systems is the task of automatically answering a question posed in Natural Language Processing (NLP). NLP techniques are used in applications that make queries to databases, extract information from text, retrieve relevant documents from a collection, translate from one language to another, generate text responses, or recognize spoken words converting them into text. To find the needed information on a mass of the non-structured information we have to use techniques in which the accuracy and retrieval factors are implemented well. In this paper in order to well IR in web environment, The QA system in designed and also implemented based on the Hidden Markov Model (HMM)
Deep learning based Arabic short answer grading in serious gamesIJECEIAES
Automatic short answer grading (ASAG) has become part of natural language processing problems. Modern ASAG systems start with natural language preprocessing and end with grading. Researchers started experimenting with machine learning in the preprocessing stage and deep learning techniques in automatic grading for English. However, little research is available on automatic grading for Arabic. Datasets are important to ASAG, and limited datasets are available in Arabic. In this research, we have collected a set of questions, answers, and associated grades in Arabic. We have made this dataset publicly available. We have extended to Arabic the solutions used for English ASAG. We have tested how automatic grading works on answers in Arabic provided by schoolchildren in 6th grade in the context of serious games. We found out those schoolchildren providing answers that are 5.6 words long on average. On such answers, deep learning-based grading has achieved high accuracy even with limited training data. We have tested three different recurrent neural networks for grading. With a transformer, we have achieved an accuracy of 95.67%. ASAG for school children will help detect children with learning problems early. When detected early, teachers can solve learning problems easily. This is the main purpose of this research.
A black-box-approach-for-response-quality-evaluation-of-conversational-agent-...Cemal Ardil
The document discusses the challenges of evaluating conversational agent systems. It proposes a black-box approach to evaluate response quality using observation, classification, and scoring. This approach is used to assess three example systems: AnswerBus, START, and AINI. The evaluation of conversational agents is difficult because existing information retrieval metrics do not apply, and there is no single correct answer for many questions.
Ontology Based Approach for Semantic Information Retrieval SystemIJTET Journal
Abstract—The Information retrieval system is taking an important role in current search engine which performs searching operation based on keywords which results in an enormous amount of data available to the user, from which user cannot figure out the essential and most important information. This limitation may be overcome by a new web architecture known as the semantic web which overcome the limitation of the keyword based search technique called the conceptual or the semantic search technique. Natural language processing technique is mostly implemented in a QA system for asking user’s questions and several steps are also followed for conversion of questions to the query form for retrieving an exact answer. In conceptual search, search engine interprets the meaning of the user’s query and the relation among the concepts that document contains with respect to a particular domain that produces specific answers instead of showing lists of answers. In this paper, we proposed the ontology based semantic information retrieval system and the Jena semantic web framework in which, the user enters an input query which is parsed by Standford Parser then the triplet extraction algorithm is used. For all input queries, the SPARQL query is formed and further, it is fired on the knowledge base (Ontology) which finds appropriate RDF triples in knowledge base and retrieve the relevant information using the Jena framework.
Open domain question answering system using semantic role labelingeSAT Publishing House
1. The document describes a proposed open domain question answering system that uses semantic role labeling to extract answers from documents retrieved from the web.
2. The system consists of three modules: question processing, document retrieval, and answer extraction. Semantic role labeling is used in the answer extraction module to identify answers based on the question type.
3. An evaluation of the proposed system showed it achieved higher accuracy compared to a baseline system using only pattern matching for answer extraction.
AUTOMATIC QUESTION GENERATION USING NATURAL LANGUAGE PROCESSINGIRJET Journal
The document describes a proposed method for automatic question generation using natural language processing and T5 text-to-text transfer transformer models. The method uses T5 models trained on the Stanford Question Answering Dataset to generate questions from paragraphs of text without requiring extensive grammar rules. The proposed system aims to assist students in learning by generating questions to test their understanding from provided materials.
Sensing complicated meanings from unstructured data: a novel hybrid approachIJECEIAES
The majority of data on computers nowadays is in the form of unstructured data and unstructured text. The inherent ambiguity of natural language makes it incredibly difficult but also highly profitable to find hidden information or comprehend complex semantics in unstructured text. In this paper, we present the combination of natural language processing (NLP) and convolution neural network (CNN) hybrid architecture called automated analysis of unstructured text using machine learning (AAUT-ML) for the detection of complex semantics from unstructured data that enables different users to make understand formal semantic knowledge to be extracted from an unstructured text corpus. The AAUT-ML has been evaluated using three datasets data mining (DM), operating system (OS), and data base (DB), and compared with the existing models, i.e., YAKE, term frequency-inverse document frequency (TF-IDF) and text-R. The results show better outcomes in terms of precision, recall, and macro-averaged F1-score. This work presents a novel method for identifying complex semantics using unstructured data.
Answer extraction and passage retrieval forWaheeb Ahmed
—Question Answering systems (QASs) do the task of
retrieving text portions from a collection of documents that
contain the answer to the user’s questions. These QASs use a
variety of linguistic tools that be able to deal with small
fragments of text. Therefore, to retrieve the documents which
contains the answer from a large document collections, QASs
employ Information Retrieval (IR) techniques to minimize the
number of documents collections to a treatable amount of
relevant text. In this paper, we propose a model for passage
retrieval model that do this task with a better performance for
the purpose of Arabic QASs. We first segment each the top five
ranked documents returned by the IR module into passages.
Then, we compute the similarity score between the user’s
question terms and each passage. The top five passages (with
high similarity score) are retrieved are retrieved. Finally,
Answer Extraction techniques are applied to extract the final
answer. Our method achieved an average for precision of
87.25%, Recall of 86.2% and F1-measure of 87%.
In this research work we have develop a new scoring mathematical model that works on the five types of questions. The question text failures are first extracted and a score is found based on its structure with respect to its template structure and then answer score is calculated again the question as well as paragraph. Text to finally reach at the index of the most probable answer with respect to question.
IRJET- Visual Question Answering using Combination of LSTM and CNN: A SurveyIRJET Journal
This document discusses using a combination of long short-term memory (LSTM) and convolutional neural networks (CNN) for visual question answering (VQA). It proposes extracting image features from CNNs and encoding question semantics with LSTMs. A multilayer perceptron would then combine the image and question representations to predict answers. The methodology aims to reduce statistical biases in VQA datasets by focusing attention on relevant image regions. It was implemented in Keras with TensorFlow using pre-trained CNNs for images and word embeddings for questions. The proposed approach analyzes local image features and question semantics to improve VQA classification accuracy over methods relying solely on language.
This document describes a proposed AI-based question answering system to generate and evaluate questions for students. Key points:
- The system would generate objective and subjective questions from course materials uploaded by instructors to provide automated assessments.
- It would use natural language processing and machine learning models to understand text, extract keywords to frame questions, and generate answer options for multiple-choice questions.
- Student responses would be evaluated by comparing them to ideal answers using similarity metrics like cosine similarity and Jaccard similarity to provide scores.
- An implementation tested on sample course materials found the model could accurately generate questions and evaluate student knowledge within the limitations of not handling diagrams, math formulations, or subjective literature evaluations.
RoBERTa: language modelling in building Indonesian question-answering systemsTELKOMNIKA JOURNAL
This research aimed to evaluate the performance of the A Lite BERT (ALBERT), efficiently learning an encoder that classifies token replacements accurately (ELECTRA) and a robust optimized BERT pretraining approach (RoBERTa) models to support the development of the Indonesian language question and answer system model. The evaluation carried out used Indonesian, Malay and Esperanto. Here, Esperanto was used as a comparison of Indonesian because it is international, which does not belong to any person or country and this then make it neutral. Compared to other foreign languages, the structure and construction of Esperanto is relatively simple. The dataset used was the result of crawling Wikipedia for Indonesian and Open Super-large Crawled ALMAnaCH coRpus (OSCAR) for Esperanto. The size of the token dictionary used in the test used approximately 30,000 sub tokens in both the SentencePiece and byte-level byte pair encoding methods (ByteLevelBPE). The test was carried out with the learning rates of 1e-5 and 5e-5 for both languages in accordance with the reference from the bidirectional encoder representations from transformers (BERT) paper. As shown in the final result of this study, the ALBERT and RoBERTa models in Esperanto showed the results of the loss calculation that were not much different. This showed that the RoBERTa model was better to implement an Indonesian question and answer system.
Similar to Convolutional recurrent neural network with template based representation for complex question answering (20)
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...IJECEIAES
Medical image analysis has witnessed significant advancements with deep learning techniques. In the domain of brain tumor segmentation, the ability to
precisely delineate tumor boundaries from magnetic resonance imaging (MRI)
scans holds profound implications for diagnosis. This study presents an ensemble convolutional neural network (CNN) with transfer learning, integrating
the state-of-the-art Deeplabv3+ architecture with the ResNet18 backbone. The
model is rigorously trained and evaluated, exhibiting remarkable performance
metrics, including an impressive global accuracy of 99.286%, a high-class accuracy of 82.191%, a mean intersection over union (IoU) of 79.900%, a weighted
IoU of 98.620%, and a Boundary F1 (BF) score of 83.303%. Notably, a detailed comparative analysis with existing methods showcases the superiority of
our proposed model. These findings underscore the model’s competence in precise brain tumor localization, underscoring its potential to revolutionize medical
image analysis and enhance healthcare outcomes. This research paves the way
for future exploration and optimization of advanced CNN models in medical
imaging, emphasizing addressing false positives and resource efficiency.
Embedded machine learning-based road conditions and driving behavior monitoringIJECEIAES
Car accident rates have increased in recent years, resulting in losses in human lives, properties, and other financial costs. An embedded machine learning-based system is developed to address this critical issue. The system can monitor road conditions, detect driving patterns, and identify aggressive driving behaviors. The system is based on neural networks trained on a comprehensive dataset of driving events, driving styles, and road conditions. The system effectively detects potential risks and helps mitigate the frequency and impact of accidents. The primary goal is to ensure the safety of drivers and vehicles. Collecting data involved gathering information on three key road events: normal street and normal drive, speed bumps, circular yellow speed bumps, and three aggressive driving actions: sudden start, sudden stop, and sudden entry. The gathered data is processed and analyzed using a machine learning system designed for limited power and memory devices. The developed system resulted in 91.9% accuracy, 93.6% precision, and 92% recall. The achieved inference time on an Arduino Nano 33 BLE Sense with a 32-bit CPU running at 64 MHz is 34 ms and requires 2.6 kB peak RAM and 139.9 kB program flash memory, making it suitable for resource-constrained embedded systems.
Advanced control scheme of doubly fed induction generator for wind turbine us...IJECEIAES
This paper describes a speed control device for generating electrical energy on an electricity network based on the doubly fed induction generator (DFIG) used for wind power conversion systems. At first, a double-fed induction generator model was constructed. A control law is formulated to govern the flow of energy between the stator of a DFIG and the energy network using three types of controllers: proportional integral (PI), sliding mode controller (SMC) and second order sliding mode controller (SOSMC). Their different results in terms of power reference tracking, reaction to unexpected speed fluctuations, sensitivity to perturbations, and resilience against machine parameter alterations are compared. MATLAB/Simulink was used to conduct the simulations for the preceding study. Multiple simulations have shown very satisfying results, and the investigations demonstrate the efficacy and power-enhancing capabilities of the suggested control system.
Neural network optimizer of proportional-integral-differential controller par...IJECEIAES
Wide application of proportional-integral-differential (PID)-regulator in industry requires constant improvement of methods of its parameters adjustment. The paper deals with the issues of optimization of PID-regulator parameters with the use of neural network technology methods. A methodology for choosing the architecture (structure) of neural network optimizer is proposed, which consists in determining the number of layers, the number of neurons in each layer, as well as the form and type of activation function. Algorithms of neural network training based on the application of the method of minimizing the mismatch between the regulated value and the target value are developed. The method of back propagation of gradients is proposed to select the optimal training rate of neurons of the neural network. The neural network optimizer, which is a superstructure of the linear PID controller, allows increasing the regulation accuracy from 0.23 to 0.09, thus reducing the power consumption from 65% to 53%. The results of the conducted experiments allow us to conclude that the created neural superstructure may well become a prototype of an automatic voltage regulator (AVR)-type industrial controller for tuning the parameters of the PID controller.
An improved modulation technique suitable for a three level flying capacitor ...IJECEIAES
This research paper introduces an innovative modulation technique for controlling a 3-level flying capacitor multilevel inverter (FCMLI), aiming to streamline the modulation process in contrast to conventional methods. The proposed
simplified modulation technique paves the way for more straightforward and
efficient control of multilevel inverters, enabling their widespread adoption and
integration into modern power electronic systems. Through the amalgamation of
sinusoidal pulse width modulation (SPWM) with a high-frequency square wave
pulse, this controlling technique attains energy equilibrium across the coupling
capacitor. The modulation scheme incorporates a simplified switching pattern
and a decreased count of voltage references, thereby simplifying the control
algorithm.
A review on features and methods of potential fishing zoneIJECEIAES
This review focuses on the importance of identifying potential fishing zones in seawater for sustainable fishing practices. It explores features like sea surface temperature (SST) and sea surface height (SSH), along with classification methods such as classifiers. The features like SST, SSH, and different classifiers used to classify the data, have been figured out in this review study. This study underscores the importance of examining potential fishing zones using advanced analytical techniques. It thoroughly explores the methodologies employed by researchers, covering both past and current approaches. The examination centers on data characteristics and the application of classification algorithms for classification of potential fishing zones. Furthermore, the prediction of potential fishing zones relies significantly on the effectiveness of classification algorithms. Previous research has assessed the performance of models like support vector machines, naïve Bayes, and artificial neural networks (ANN). In the previous result, the results of support vector machine (SVM) were 97.6% more accurate than naive Bayes's 94.2% to classify test data for fisheries classification. By considering the recent works in this area, several recommendations for future works are presented to further improve the performance of the potential fishing zone models, which is important to the fisheries community.
Electrical signal interference minimization using appropriate core material f...IJECEIAES
As demand for smaller, quicker, and more powerful devices rises, Moore's law is strictly followed. The industry has worked hard to make little devices that boost productivity. The goal is to optimize device density. Scientists are reducing connection delays to improve circuit performance. This helped them understand three-dimensional integrated circuit (3D IC) concepts, which stack active devices and create vertical connections to diminish latency and lower interconnects. Electrical involvement is a big worry with 3D integrates circuits. Researchers have developed and tested through silicon via (TSV) and substrates to decrease electrical wave involvement. This study illustrates a novel noise coupling reduction method using several electrical involvement models. A 22% drop in electrical involvement from wave-carrying to victim TSVs introduces this new paradigm and improves system performance even at higher THz frequencies.
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...IJECEIAES
Climate change's impact on the planet forced the United Nations and governments to promote green energies and electric transportation. The deployments of photovoltaic (PV) and electric vehicle (EV) systems gained stronger momentum due to their numerous advantages over fossil fuel types. The advantages go beyond sustainability to reach financial support and stability. The work in this paper introduces the hybrid system between PV and EV to support industrial and commercial plants. This paper covers the theoretical framework of the proposed hybrid system including the required equation to complete the cost analysis when PV and EV are present. In addition, the proposed design diagram which sets the priorities and requirements of the system is presented. The proposed approach allows setup to advance their power stability, especially during power outages. The presented information supports researchers and plant owners to complete the necessary analysis while promoting the deployment of clean energy. The result of a case study that represents a dairy milk farmer supports the theoretical works and highlights its advanced benefits to existing plants. The short return on investment of the proposed approach supports the paper's novelty approach for the sustainable electrical system. In addition, the proposed system allows for an isolated power setup without the need for a transmission line which enhances the safety of the electrical network
Bibliometric analysis highlighting the role of women in addressing climate ch...IJECEIAES
Fossil fuel consumption increased quickly, contributing to climate change
that is evident in unusual flooding and draughts, and global warming. Over
the past ten years, women's involvement in society has grown dramatically,
and they succeeded in playing a noticeable role in reducing climate change.
A bibliometric analysis of data from the last ten years has been carried out to
examine the role of women in addressing the climate change. The analysis's
findings discussed the relevant to the sustainable development goals (SDGs),
particularly SDG 7 and SDG 13. The results considered contributions made
by women in the various sectors while taking geographic dispersion into
account. The bibliometric analysis delves into topics including women's
leadership in environmental groups, their involvement in policymaking, their
contributions to sustainable development projects, and the influence of
gender diversity on attempts to mitigate climate change. This study's results
highlight how women have influenced policies and actions related to climate
change, point out areas of research deficiency and recommendations on how
to increase role of the women in addressing the climate change and
achieving sustainability. To achieve more successful results, this initiative
aims to highlight the significance of gender equality and encourage
inclusivity in climate change decision-making processes.
Voltage and frequency control of microgrid in presence of micro-turbine inter...IJECEIAES
The active and reactive load changes have a significant impact on voltage
and frequency. In this paper, in order to stabilize the microgrid (MG) against
load variations in islanding mode, the active and reactive power of all
distributed generators (DGs), including energy storage (battery), diesel
generator, and micro-turbine, are controlled. The micro-turbine generator is
connected to MG through a three-phase to three-phase matrix converter, and
the droop control method is applied for controlling the voltage and
frequency of MG. In addition, a method is introduced for voltage and
frequency control of micro-turbines in the transition state from gridconnected mode to islanding mode. A novel switching strategy of the matrix
converter is used for converting the high-frequency output voltage of the
micro-turbine to the grid-side frequency of the utility system. Moreover,
using the switching strategy, the low-order harmonics in the output current
and voltage are not produced, and consequently, the size of the output filter
would be reduced. In fact, the suggested control strategy is load-independent
and has no frequency conversion restrictions. The proposed approach for
voltage and frequency regulation demonstrates exceptional performance and
favorable response across various load alteration scenarios. The suggested
strategy is examined in several scenarios in the MG test systems, and the
simulation results are addressed.
Enhancing battery system identification: nonlinear autoregressive modeling fo...IJECEIAES
Precisely characterizing Li-ion batteries is essential for optimizing their
performance, enhancing safety, and prolonging their lifespan across various
applications, such as electric vehicles and renewable energy systems. This
article introduces an innovative nonlinear methodology for system
identification of a Li-ion battery, employing a nonlinear autoregressive with
exogenous inputs (NARX) model. The proposed approach integrates the
benefits of nonlinear modeling with the adaptability of the NARX structure,
facilitating a more comprehensive representation of the intricate
electrochemical processes within the battery. Experimental data collected
from a Li-ion battery operating under diverse scenarios are employed to
validate the effectiveness of the proposed methodology. The identified
NARX model exhibits superior accuracy in predicting the battery's behavior
compared to traditional linear models. This study underscores the
importance of accounting for nonlinearities in battery modeling, providing
insights into the intricate relationships between state-of-charge, voltage, and
current under dynamic conditions.
Smart grid deployment: from a bibliometric analysis to a surveyIJECEIAES
Smart grids are one of the last decades' innovations in electrical energy.
They bring relevant advantages compared to the traditional grid and
significant interest from the research community. Assessing the field's
evolution is essential to propose guidelines for facing new and future smart
grid challenges. In addition, knowing the main technologies involved in the
deployment of smart grids (SGs) is important to highlight possible
shortcomings that can be mitigated by developing new tools. This paper
contributes to the research trends mentioned above by focusing on two
objectives. First, a bibliometric analysis is presented to give an overview of
the current research level about smart grid deployment. Second, a survey of
the main technological approaches used for smart grid implementation and
their contributions are highlighted. To that effect, we searched the Web of
Science (WoS), and the Scopus databases. We obtained 5,663 documents
from WoS and 7,215 from Scopus on smart grid implementation or
deployment. With the extraction limitation in the Scopus database, 5,872 of
the 7,215 documents were extracted using a multi-step process. These two
datasets have been analyzed using a bibliometric tool called bibliometrix.
The main outputs are presented with some recommendations for future
research.
Use of analytical hierarchy process for selecting and prioritizing islanding ...IJECEIAES
One of the problems that are associated to power systems is islanding
condition, which must be rapidly and properly detected to prevent any
negative consequences on the system's protection, stability, and security.
This paper offers a thorough overview of several islanding detection
strategies, which are divided into two categories: classic approaches,
including local and remote approaches, and modern techniques, including
techniques based on signal processing and computational intelligence.
Additionally, each approach is compared and assessed based on several
factors, including implementation costs, non-detected zones, declining
power quality, and response times using the analytical hierarchy process
(AHP). The multi-criteria decision-making analysis shows that the overall
weight of passive methods (24.7%), active methods (7.8%), hybrid methods
(5.6%), remote methods (14.5%), signal processing-based methods (26.6%),
and computational intelligent-based methods (20.8%) based on the
comparison of all criteria together. Thus, it can be seen from the total weight
that hybrid approaches are the least suitable to be chosen, while signal
processing-based methods are the most appropriate islanding detection
method to be selected and implemented in power system with respect to the
aforementioned factors. Using Expert Choice software, the proposed
hierarchy model is studied and examined.
Enhancing of single-stage grid-connected photovoltaic system using fuzzy logi...IJECEIAES
The power generated by photovoltaic (PV) systems is influenced by
environmental factors. This variability hampers the control and utilization of
solar cells' peak output. In this study, a single-stage grid-connected PV
system is designed to enhance power quality. Our approach employs fuzzy
logic in the direct power control (DPC) of a three-phase voltage source
inverter (VSI), enabling seamless integration of the PV connected to the
grid. Additionally, a fuzzy logic-based maximum power point tracking
(MPPT) controller is adopted, which outperforms traditional methods like
incremental conductance (INC) in enhancing solar cell efficiency and
minimizing the response time. Moreover, the inverter's real-time active and
reactive power is directly managed to achieve a unity power factor (UPF).
The system's performance is assessed through MATLAB/Simulink
implementation, showing marked improvement over conventional methods,
particularly in steady-state and varying weather conditions. For solar
irradiances of 500 and 1,000 W/m2
, the results show that the proposed
method reduces the total harmonic distortion (THD) of the injected current
to the grid by approximately 46% and 38% compared to conventional
methods, respectively. Furthermore, we compare the simulation results with
IEEE standards to evaluate the system's grid compatibility.
Enhancing photovoltaic system maximum power point tracking with fuzzy logic-b...IJECEIAES
Photovoltaic systems have emerged as a promising energy resource that
caters to the future needs of society, owing to their renewable, inexhaustible,
and cost-free nature. The power output of these systems relies on solar cell
radiation and temperature. In order to mitigate the dependence on
atmospheric conditions and enhance power tracking, a conventional
approach has been improved by integrating various methods. To optimize
the generation of electricity from solar systems, the maximum power point
tracking (MPPT) technique is employed. To overcome limitations such as
steady-state voltage oscillations and improve transient response, two
traditional MPPT methods, namely fuzzy logic controller (FLC) and perturb
and observe (P&O), have been modified. This research paper aims to
simulate and validate the step size of the proposed modified P&O and FLC
techniques within the MPPT algorithm using MATLAB/Simulink for
efficient power tracking in photovoltaic systems.
Adaptive synchronous sliding control for a robot manipulator based on neural ...IJECEIAES
Robot manipulators have become important equipment in production lines, medical fields, and transportation. Improving the quality of trajectory tracking for
robot hands is always an attractive topic in the research community. This is a
challenging problem because robot manipulators are complex nonlinear systems
and are often subject to fluctuations in loads and external disturbances. This
article proposes an adaptive synchronous sliding control scheme to improve trajectory tracking performance for a robot manipulator. The proposed controller
ensures that the positions of the joints track the desired trajectory, synchronize
the errors, and significantly reduces chattering. First, the synchronous tracking
errors and synchronous sliding surfaces are presented. Second, the synchronous
tracking error dynamics are determined. Third, a robust adaptive control law is
designed,the unknown components of the model are estimated online by the neural network, and the parameters of the switching elements are selected by fuzzy
logic. The built algorithm ensures that the tracking and approximation errors
are ultimately uniformly bounded (UUB). Finally, the effectiveness of the constructed algorithm is demonstrated through simulation and experimental results.
Simulation and experimental results show that the proposed controller is effective with small synchronous tracking errors, and the chattering phenomenon is
significantly reduced.
Remote field-programmable gate array laboratory for signal acquisition and de...IJECEIAES
A remote laboratory utilizing field-programmable gate array (FPGA) technologies enhances students’ learning experience anywhere and anytime in embedded system design. Existing remote laboratories prioritize hardware access and visual feedback for observing board behavior after programming, neglecting comprehensive debugging tools to resolve errors that require internal signal acquisition. This paper proposes a novel remote embeddedsystem design approach targeting FPGA technologies that are fully interactive via a web-based platform. Our solution provides FPGA board access and debugging capabilities beyond the visual feedback provided by existing remote laboratories. We implemented a lab module that allows users to seamlessly incorporate into their FPGA design. The module minimizes hardware resource utilization while enabling the acquisition of a large number of data samples from the signal during the experiments by adaptively compressing the signal prior to data transmission. The results demonstrate an average compression ratio of 2.90 across three benchmark signals, indicating efficient signal acquisition and effective debugging and analysis. This method allows users to acquire more data samples than conventional methods. The proposed lab allows students to remotely test and debug their designs, bridging the gap between theory and practice in embedded system design.
Detecting and resolving feature envy through automated machine learning and m...IJECEIAES
Efficiently identifying and resolving code smells enhances software project quality. This paper presents a novel solution, utilizing automated machine learning (AutoML) techniques, to detect code smells and apply move method refactoring. By evaluating code metrics before and after refactoring, we assessed its impact on coupling, complexity, and cohesion. Key contributions of this research include a unique dataset for code smell classification and the development of models using AutoGluon for optimal performance. Furthermore, the study identifies the top 20 influential features in classifying feature envy, a well-known code smell, stemming from excessive reliance on external classes. We also explored how move method refactoring addresses feature envy, revealing reduced coupling and complexity, and improved cohesion, ultimately enhancing code quality. In summary, this research offers an empirical, data-driven approach, integrating AutoML and move method refactoring to optimize software project quality. Insights gained shed light on the benefits of refactoring on code quality and the significance of specific features in detecting feature envy. Future research can expand to explore additional refactoring techniques and a broader range of code metrics, advancing software engineering practices and standards.
Smart monitoring technique for solar cell systems using internet of things ba...IJECEIAES
Rapidly and remotely monitoring and receiving the solar cell systems status parameters, solar irradiance, temperature, and humidity, are critical issues in enhancement their efficiency. Hence, in the present article an improved smart prototype of internet of things (IoT) technique based on embedded system through NodeMCU ESP8266 (ESP-12E) was carried out experimentally. Three different regions at Egypt; Luxor, Cairo, and El-Beheira cities were chosen to study their solar irradiance profile, temperature, and humidity by the proposed IoT system. The monitoring data of solar irradiance, temperature, and humidity were live visualized directly by Ubidots through hypertext transfer protocol (HTTP) protocol. The measured solar power radiation in Luxor, Cairo, and El-Beheira ranged between 216-1000, 245-958, and 187-692 W/m 2 respectively during the solar day. The accuracy and rapidity of obtaining monitoring results using the proposed IoT system made it a strong candidate for application in monitoring solar cell systems. On the other hand, the obtained solar power radiation results of the three considered regions strongly candidate Luxor and Cairo as suitable places to build up a solar cells system station rather than El-Beheira.
An efficient security framework for intrusion detection and prevention in int...IJECEIAES
Over the past few years, the internet of things (IoT) has advanced to connect billions of smart devices to improve quality of life. However, anomalies or malicious intrusions pose several security loopholes, leading to performance degradation and threat to data security in IoT operations. Thereby, IoT security systems must keep an eye on and restrict unwanted events from occurring in the IoT network. Recently, various technical solutions based on machine learning (ML) models have been derived towards identifying and restricting unwanted events in IoT. However, most ML-based approaches are prone to miss-classification due to inappropriate feature selection. Additionally, most ML approaches applied to intrusion detection and prevention consider supervised learning, which requires a large amount of labeled data to be trained. Consequently, such complex datasets are impossible to source in a large network like IoT. To address this problem, this proposed study introduces an efficient learning mechanism to strengthen the IoT security aspects. The proposed algorithm incorporates supervised and unsupervised approaches to improve the learning models for intrusion detection and mitigation. Compared with the related works, the experimental outcome shows that the model performs well in a benchmark dataset. It accomplishes an improved detection accuracy of approximately 99.21%.
International Conference on NLP, Artificial Intelligence, Machine Learning an...gerogepatton
International Conference on NLP, Artificial Intelligence, Machine Learning and Applications (NLAIM 2024) offers a premier global platform for exchanging insights and findings in the theory, methodology, and applications of NLP, Artificial Intelligence, Machine Learning, and their applications. The conference seeks substantial contributions across all key domains of NLP, Artificial Intelligence, Machine Learning, and their practical applications, aiming to foster both theoretical advancements and real-world implementations. With a focus on facilitating collaboration between researchers and practitioners from academia and industry, the conference serves as a nexus for sharing the latest developments in the field.
Introduction- e - waste – definition - sources of e-waste– hazardous substances in e-waste - effects of e-waste on environment and human health- need for e-waste management– e-waste handling rules - waste minimization techniques for managing e-waste – recycling of e-waste - disposal treatment methods of e- waste – mechanism of extraction of precious metal from leaching solution-global Scenario of E-waste – E-waste in India- case studies.
A SYSTEMATIC RISK ASSESSMENT APPROACH FOR SECURING THE SMART IRRIGATION SYSTEMSIJNSA Journal
The smart irrigation system represents an innovative approach to optimize water usage in agricultural and landscaping practices. The integration of cutting-edge technologies, including sensors, actuators, and data analysis, empowers this system to provide accurate monitoring and control of irrigation processes by leveraging real-time environmental conditions. The main objective of a smart irrigation system is to optimize water efficiency, minimize expenses, and foster the adoption of sustainable water management methods. This paper conducts a systematic risk assessment by exploring the key components/assets and their functionalities in the smart irrigation system. The crucial role of sensors in gathering data on soil moisture, weather patterns, and plant well-being is emphasized in this system. These sensors enable intelligent decision-making in irrigation scheduling and water distribution, leading to enhanced water efficiency and sustainable water management practices. Actuators enable automated control of irrigation devices, ensuring precise and targeted water delivery to plants. Additionally, the paper addresses the potential threat and vulnerabilities associated with smart irrigation systems. It discusses limitations of the system, such as power constraints and computational capabilities, and calculates the potential security risks. The paper suggests possible risk treatment methods for effective secure system operation. In conclusion, the paper emphasizes the significant benefits of implementing smart irrigation systems, including improved water conservation, increased crop yield, and reduced environmental impact. Additionally, based on the security analysis conducted, the paper recommends the implementation of countermeasures and security approaches to address vulnerabilities and ensure the integrity and reliability of the system. By incorporating these measures, smart irrigation technology can revolutionize water management practices in agriculture, promoting sustainability, resource efficiency, and safeguarding against potential security threats.
Literature Review Basics and Understanding Reference Management.pptxDr Ramhari Poudyal
Three-day training on academic research focuses on analytical tools at United Technical College, supported by the University Grant Commission, Nepal. 24-26 May 2024
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesChristina Lin
Traditionally, dealing with real-time data pipelines has involved significant overhead, even for straightforward tasks like data transformation or masking. However, in this talk, we’ll venture into the dynamic realm of WebAssembly (WASM) and discover how it can revolutionize the creation of stateless streaming pipelines within a Kafka (Redpanda) broker. These pipelines are adept at managing low-latency, high-data-volume scenarios.
We have compiled the most important slides from each speaker's presentation. This year’s compilation, available for free, captures the key insights and contributions shared during the DfMAy 2024 conference.
2. Int J Elec & Comp Eng ISSN: 2088-8708
Convolutional recurrent neural network with template based representation... (A. Chandra Obula Reddy)
2711
topic are formulated in question focus [7]. If the user modifies the topic in Community question answering
(CQA), it is shown by the topic variation of question and answer. CQA is a type of information retrieval was
user need from the community is given in the form of natural language question and community response is
in the form of natural language answer. This structure is highly complex in which the properties are sufficient
for constructing QA framework at the presence of structured data sources and domain ontology [8]. Due to
the capability of replacing and extending the components with various implementations, the portability is
obtained at the domain with respect to the targeted domain or ontology [9].
To tackle this issue, wide range of techniques has been developed for answering the questions of
natural language based on heterogeneous and large structure of data [10, 11]. Initially, the exact questions of
any topic will be posted by the data seekers and it allows the other users to share their answers.
By influencing the effects of community, they can able to obtain correct answers with the help of search
engines [12]. Better quality of result is generated with the human intelligence system when compared with
the CQA and automated QA system. The incredible amount of QA pairs has been developed with the data
storage and it provides the search and preservation of evaluated questions [13].
Question answering is the process of finding answering for the question posted in common language
automatically. QA framework requires the capability of understanding natural language linguistics, text and
common knowledge [14]. The correct answer from the given set of documents can be determined by the QA
system by using the question of natural language [15]. Natural Language Processing (NLP) approach
is commonly developed in QA system for replying user’s question. In addition to that, same kinds of
issues are termed uniquely in several QA systems. The natural language overhead like ambiguity,
co-reference, Implicitness, inference etc. created hindrance in sentiment analysis too [16]. Natural language
search, on the other hand, attempts to use natural language processing to understand the nature of
the question and then to search and return a subset of the repositories such as web databases that contains
the answer to the question. The result will be more credible and of higher relevance than results from
a keyword search engine [17].
Recently, the efficient question answering websites are invented to determine the answer
automatically [18]. This kind of question answering system faces some issues during its processing [19].
For resolving the question, certain quantities of questions are waiting and some of them are eliminated by
the user who answers that and the large amount of time is wasted for getting answer. In worst cases,
the suitable answer is not obtained from the experts by the asker [20]. The websites such as question
answering through knowledge sharing is affected by this kind of problems. Hence it is required to detect
the necessary experts for each question, thus it improves the answer quality, minimizing the waiting time and
improves the efficiency of knowledge sharing [21].
However, the issue is that a human can ask a question in variety of ways. To solve this kind of issues
several techniques has been developed with large structured and heterogeneous data for question answering
in natural languages. Previous approaches have natural limits due to their representations: rule based
approaches are utilized for processing the less collection of “canned” questions, while keyword based or
synonym based methods cannot fully understand the questions. Hence template based approaches and better
answer selection is required for efficient question answering system. In this work two parallel Convolutional
Neural Networks (CNNs) are utilized to learn the semantic correlation pattern among question answer pair.
The proposed approach effectively captures the valuable context from the sequence of answers.
The organization of our work in this paper is described as follows. Section 2 describes the proposed
approach of Complex Question Answering. Section 3 explains the research method, template based
convolutional recurrent neural network (T-CRNN) for complex question answering. In Section 4,
the experimental results of T-CRNN are analyzed. Section 5 illustrates the significant aspects of the work
and concludes.
2. THE PROPOSED METHOD
This section describes the proposed approach of complex question answering. For complex question
answering system, Binary factoid questions (BFQs) are taken into account and it has the specific property of
an entity. Initially, the entity from the input question is replaced with the templates. Based on the replaced
entities it uses divide and conquer approach for decompose the question. Then the CNN, RNN and scoring is
used for determining the correct answer. The proposed approach of CQA is shown in Figure 1.
3. ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 10, No. 3, June 2020 : 2709 - 2718
2712
n
Question
and answer
... ...
Input question
Question representation Knowledge base
Complex question decomposition (Divide and
conquer approach)
CNN
CNN
CNN
CNN
CNN
CNN
RNN RNN RNN
Multi layer perceptron
Scoring (Softmax Classifier)
Semantic
matching
patterns
Context
features
Figure 1. Proposed approach of complex question answering
2.1. Question representation
The question 𝑛 should be represented for answering the question. Question representation is
the process of converting the question from the natural language to its internal representation for capturing
the intents and semantics of that question. Matching is an important concept in the area of semantic web such
as ontology interoperability, data sharing, and document classification etc [22]. For question representation,
the knowledge base is utilized. The internal representation is denoted as template 𝑘, where, 𝑘 = 𝑘(𝑛, 𝑝, 𝛼)
and 𝛼 represents the category. Figure 2 shows the internal representation of question representation.
Each question is considered as a set of entities and the templates from the knowledge base is used to
represent the question.
The entity 𝑝 of the question is replaced by its concept and it can be achieved through the process of
conceptualization. This approach is depending on the network of large semantics (Probase [23]) and it
contains millions of concepts. Hence sufficient granularity is there for representing all kind of questions.
The knowledge base of each predicate is based on several templates. For our work, 27,126,355 templates are
learned for 2782 predicates. The large amount of templates indicates wide coverage for template based CQA.
2.2. Complex question decomposition
For complex question decomposition, divide and conquer approach is used in the proposed CQA
system. Initially, the input question is decomposed into set of BFQ’s, then each are answered separately.
The block diagram for complex question decomposition is shown in Figure 3. The decomposed question
contains modified entities except the first question. After assigning the variable with specific entity which is
the answer of previous question, only the question sequence is materialized.
The answer for the given question is obtained with RDF knowledge base 𝑈 which is in the structure
of (𝑐, 𝑔, 𝑜). Where, 𝑐 denotes the subject, 𝑔 represents the predicate and 𝑜 represents the predicates.
The answer obtained for the question is represented in the form of 𝑁𝑋1 = {(𝑛1, 𝑥1), (𝑛1, 𝑥2), … , (𝑛1, 𝑥 𝑛)}.
Where, 𝑛𝑖 represents the question and 𝑎𝑖 represent the answer for the question 𝑛𝑖. The answer 𝑥1 contains
the exact factoid answer of various sentences.
4. Int J Elec & Comp Eng ISSN: 2088-8708
Convolutional recurrent neural network with template based representation... (A. Chandra Obula Reddy)
2713
Question 1
Entity 1 Representation 1
Entity 2
Entity n
Representation 1
Representation 1
... ...
Question 2
Entity 1 Representation 1
Entity 2
Entity n
Representation 1
Representation 1
... ...
Question n
Entity 1 Representation 1
Entity 2
Entity n
Representation 1
Representation 1
... ...
...
...
Figure 2. Block diagram for question representation
Question 1
Divide and
conquer
Qn. 1
Qn. 1
Qn. 1
Ans. 1
Ans. 1
Ans. 1
...
...
Question 2
Divide and
conquer
Qn. 2
Qn. 2
Qn. 2
Ans. 2
Ans. 2
Ans. 2
...
...
Question n
Divide and
conquer
Qn. n
Qn. n
Qn. n
Ans. n
Ans. n
Ans. n
...
...
...
Figure 3. Block diagram for complex question
decomposition
3. RESEARCH METHOD
This section describes the research method, template based convolutional recurrent neural network
(T-CRNN) for complex question answering.
3.1. Semantic matching pattern
In order to obtain the sematic matching pattern between the question and answer, the CNN is used in
the Research method T-CRNN. The Block diagram of CNN based sentence model is shown in Figure 4.
The CNN is utilized to learn the representation from the QA pair (𝑛, 𝑥𝑡). For that initially, the distributional
representation of question and answer (𝑎 𝑛, 𝑎 𝑥 𝑡
) is learned separately. Then the fixed length representation 𝑏𝑡
is extracted using fully connected hidden layer.
Pre trained word
embedding
(Word to vector)
Convolutional pooling
Convolutional
and pooling
Learnt
sentence
representation
Figure 4. Block diagram of CNN based sentence model
The matrix of pre trained word embedding 𝐻 ∈ 𝑆 𝑔×|𝐹|
is utilized for providing distributional vector
for each word 𝑒𝑖 from the sentence 𝑦 = (𝑒1, … 𝑒𝑖, … 𝑒|𝑠|). Where, the number of vocabulary 𝐹 is denoted as
| 𝐹|, 𝑔 denotes the length of word vector. The input sentence matrix 𝑌 ∈ 𝑆 𝑔×|𝑦|
is constructed and it is given
as an input for convolutional sentence model. The convolutional model contains several convolutional and
pooling layers which provide high level representation of sentence vector. Convolutional Neural Network
(CNN) is a special kind of deep neural network [24].
a. Convolution
From the input sentence matrix Y, the feature map parameters (e(v,j)
, d(v,j)
) are computed in
the convolutional layer and convolutional filter 𝑗. The output for the convolutional filter is the feature map 𝐽.
In the output layer 𝑣, the feature map parameters are in the form of 2 dimensional arrays. The sliding
windows of words with width 𝑙1 are utilized for feature mapping in the first layer of convolutional unit.
The output for the convolutional layer is computed as
𝑟𝑖
(1,𝑗)
≝ 𝑟𝑖
(1,𝑗)
(𝑌) = 𝛿(𝑒(1,𝑗)
𝑟̂𝑖
(0)
+ 𝑑(1,𝑗)
) (1)
5. ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 10, No. 3, June 2020 : 2709 - 2718
2714
The above equation is in matrix form 𝑟𝑖
(1)
≝ 𝑟𝑖
(1)
(𝑌) = 𝛿(𝑒(1)
𝑟̂𝑖
(0)
+ 𝑑(1)), where 𝑐̂𝑖
(0)
= 𝑒𝑖:𝑖+𝑙1−1
denotes the 𝑙1 words of concatenated vector which is obtained by the sliding windows of convolution.
When the convolution is applied to the deeper layer, it is described as
𝑟𝑖
(𝑣)
= 𝛿(𝐸(𝑣)
𝑟̂𝑖
(𝑣−1)
+ 𝑑(𝑣)
) (2)
where, 𝑟𝑖
(𝑣,𝑗)
denotes the output of feature map, 𝛿(. ) is the activation function (e.g., Sigmoid) and
𝐸(𝑣)
≝ [𝑒(𝑣,1)
, … , 𝑒(𝑣,𝑗)
, … , 𝑒( 𝑣,| 𝐽𝑣)] denotes the parameter of layer 𝑣.
b. Pooling
Max pooling is applied to the output of convolution layer for every two unit feature 𝑗 map window.
It can be formulated as,
𝑟𝑖
(𝑣,𝑗)
= max(𝑟2𝑖−1
(𝑣−1,𝑗)
, 𝑟2𝑖
(𝑣−1,𝑗)
) (3)
The useless word composition is filtered and meaningful semantic or syntactic structure is captured through
pooling operation.
c. Matching
Two convolutional sentence models (𝐶𝑁𝑁 𝑛,. 𝐶𝑁𝑁𝑥) are derived for the given input QA pair (𝑛, 𝑥𝑡)
for getting distributional representation of sentence (𝑎 𝑛, 𝑎 𝑥 𝑡
). It is denoted as:
𝑎 𝑛 = 𝐶𝑁𝑁𝑥(𝑛, 𝐸𝑐𝑛𝑚
𝑛 ) (4)
𝑎 𝑥 𝑡
= 𝐶𝑁𝑁𝑥(𝑥𝑡, 𝐸𝑐𝑛𝑚
𝑥 ) (5)
Here, the parameters of convolutional sentence model are denoted as 𝐸𝑐𝑛𝑛
𝑛
, 𝐸𝑐𝑛𝑛
𝑥
. The sentence vector
(𝑎 𝑛, 𝑎 𝑥 𝑡
) and feature vectors 𝑎 𝑓𝑒𝑎𝑡 are integrated into a concatenated vector 𝐴 𝑡 = [𝑎 𝑛, 𝑎 𝑥 𝑡
, 𝑎 𝑓𝑒𝑎𝑡] using
a joint layer. The fully connected hidden layer is used for learning the representation of QA pair by taking
the concatenated vector.
𝑣 𝑡 ≝ 𝑀𝑡 = 𝛿(𝐸 𝑚 𝑎𝑡 + 𝑑 𝑚) (6)
where, 𝛿(. ) is the activation function, and (𝐸 𝑚, 𝑑 𝑚) denotes the parameters of hidden layer. The output for
the input QA pair (𝑛, 𝑥𝑡) is the learnt representation.
3.2. Context dependent representation with T-CRNN
In order to find the correlation from the sequence of answers the RNN is utilized. There are two
constraints for the correlation among answers. For the subsequent answer, the precedent answer may contain
the context information and the comments for the precedent answer are available in the sub sequent answer.
The bi directional RNN is utilizedfor modelling the semantic correlation between answers and for gathering
the context based representations of the set of answers.
For modeling the basic block for the semantic correlation among answers, the Long Short-Term
Memory (LSTM) unit is used. It is possible to modulate the memory instead of overwriting the stateseach
time. The memory cell 𝑟𝑡 is a key equipment of LSTM unitand its state is varied over time. The LSTM unit
takes decision regardingthe modification and addition of memory in the cell 𝑟𝑡 through sigmoidal unit.
It includes the input unit 𝑖𝑡, output unit 𝑜𝑡 and forget unit 𝑗𝑡.
The QA pair representation 𝑏𝑡 for the current time 𝑡, the memory cell is modified through
the activation of forget unit 𝑗𝑡 and input unit 𝑖𝑡. The updated equation is denoted as follows:
𝑟𝑡 = 𝑗𝑡 𝑟𝑡−1 + 𝑖𝑡 tanh(𝑒 𝑎𝑟 𝑏𝑡 + 𝑒 𝑚𝑟 𝑚𝑡−1 + 𝑑 𝑟) (7)
The context of the LSTM unit is updated by eliminating the unwanted context in forget module 𝑗𝑡
and append the new part from input module 𝑖𝑡. The extensions for modulating these two contexts are
computed as follows:
𝑖𝑡 = 𝛿(𝑒 𝑎𝑖 𝑏𝑡 + 𝑒 𝑚𝑖 𝑚(𝑡−1) + 𝑒 𝑟𝑖 𝑟𝑡−1 + 𝑑𝑖) (8)
𝑗𝑡 = 𝛿(𝑒 𝑎𝑗 𝑏𝑡 + 𝑒 𝑚𝑗 𝑚(𝑡−1) + 𝑒 𝑟𝑗 𝑟𝑡−1 + 𝑑𝑗) (9)
6. Int J Elec & Comp Eng ISSN: 2088-8708
Convolutional recurrent neural network with template based representation... (A. Chandra Obula Reddy)
2715
For the updated cell state 𝑟𝑡, the representation 𝑚𝑡 and the output 𝑜𝑡 from the LSTM unit are computed as,
𝑜𝑡 = 𝛿(𝑒 𝑎𝑜 𝑏𝑡 + 𝑒 𝑚𝑜 𝑚 𝑡−1 + 𝑒 𝑟𝑜 𝑡𝑡 + 𝑑 𝑜) (10)
𝑚𝑡 = 𝑜𝑡 tanh(𝑟𝑡) (11)
where, 𝑒 𝑎∗ ∈ 𝑆 𝑧 𝑚×𝑧𝑖, 𝑒 𝑟∗ ∈ 𝑆 𝑧 𝑚×𝑧 𝑚 , (𝑒 ∗, 𝑑 ∗) represents the LSTM unit parameters, 𝛿(. ) represents
the sigmoid activation function, 𝑧 𝑚 is the dimension of hidden layer, 𝑧𝑖 denotes the dimension of QZ pair
representation and 𝑒 𝑟𝑗, 𝑒 𝑟𝑖, 𝑒 𝑟𝑜 are diagonal matrices.
The bi directional RNNs are utilized for parallely learning the QA pairs from the QA pair
representation 𝐺𝐾 = [𝑏1,… , 𝑏𝑡,… , 𝑏|𝐺𝐾|] in forward and reverse direction. It uses both future and previous
context in the answer sequence. The bi directional RNNs (𝑅𝑁⃗⃗ 𝑁, 𝑅𝑁⃖⃗⃗ 𝑁) are utilized in addition with the fixed
representation 𝑏𝑡 of specific QA pair (𝑛, 𝑥𝑡) for obtaining the context dependent representation.
𝑚𝑡⃗⃗⃗⃗⃗ = 𝑅𝑁⃗⃗ 𝑁(𝑏𝑡, 𝑚𝑡−1⃗⃗⃗⃗⃗⃗⃗⃗⃗ , 𝐸𝑟𝑛𝑚
⃗⃗⃗⃗⃗⃗⃗⃗⃗ ) (12)
𝑚𝑡⃖⃗⃗⃗⃗⃗ = 𝑅𝑁⃖⃗⃗ 𝑁(𝑏𝑡, 𝑚𝑡−1⃖⃗⃗⃗⃗⃗⃗⃗⃗⃗, 𝐸𝑟𝑛𝑚
⃖⃗⃗⃗⃗⃗⃗⃗⃗⃗) (13)
where 𝐸𝑟𝑛𝑛
⃖⃗⃗⃗⃗⃗⃗⃗⃗, 𝐸𝑟𝑛𝑛
⃗⃗⃗⃗⃗⃗⃗⃗ represents the concatenate parameters of two directional RNN. From the learning of bi
directional sequence the learnt representation is combined and the context dependent representation 𝑚𝑡 is
computed for the QA pair (𝑛, 𝑥𝑡).
𝑚𝑡 = [
𝑚𝑡⃗⃗⃗⃗⃗
𝑚𝑡⃖⃗⃗⃗⃗⃗
] (14)
In order to compute the score 𝑠𝑐𝑜 = [𝑜1, … , 𝑜𝑖, … , 𝑜|𝑅|] over the answer class 𝑅, 𝑚𝑡 is given into
the scoring process. It can be computed as,
𝑠𝑐𝑜 = 𝛿(𝐸 𝑤 𝑚𝑡 + 𝑑 𝑤) (15)
where 𝐸 𝑤 ∈ 𝑆 𝑧 𝑟×𝑧 𝑜, (𝐸 𝑤, 𝑑 𝑤) denotes the scoring parameter, 𝑧 𝑟 represents the number of answer classes, and
𝑧 𝑜 denotes the final representation of 𝑚𝑡. The prediction for the current answer 𝑥𝑡 is computed with
the softmax function by using 𝑠𝑐𝑜.
4. RESULTS AND DISCUSSION
The dataset used for our work is Factoid Q&A Corpus [25] which contains 1,714 manually-created
factoid questions and their relevant answers collected by the University of Pittsburgh and Carnegie Mellon
University between 2008 and 2010. The configurations used in our proposed T-CRNN are described as
follows. For modelling both question and answer, the CNN is constructed with 3 convolutional and pooling
layers. The number of feature map utilized for each layer is 100. The convolution window size of each layer
is 5x100, 4x1, 2x1 and the pooling window size is 4x100, 3x1, 3x1. This configuration is also denoted as
(5,4,2) and (4,3,3). The RNN is modelled with 2 LSTM units and the size of all LSTM units is set to 360.
Pre-trained word embedding is used with the dimension of 100. For each sentence, the maximum length is set
to 100.The proposed T-CRNN is compared with the existing approaches such as JAIST, ICRC and RCNN.
The performance of the proposed work is validated with performance measures. The increased value of these
measures shows percentage of improvement. The performance measures such as precision, recall, F-measure
and accuracy are estimated.
4.1. Precision
Precision computes the amount of positive instance among the total amount of retrieved instances.
It is also known as positive predictive value. It is also termed as sensitivity. The precision value comparisons
with different approaches for varying the number of documents are shown in Figure 5. For the existing
approaches, the precision is in the range between 0.4 and 0.6 when varying the number of documents to 300,
500, 700 and 1000. The proposed approach reaches the precision level up to 0.98. When the number of
documents is increased, then there is degradation in precision value. The existing approaches provide equal
performance and it is less than 0.6. The improved value of T-CRNN shows the efficiency of the proposed
approach. The values obtained for the proposed approach is 0.98, 0.97, 0.95 and 0.93. The average precision
7. ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 10, No. 3, June 2020 : 2709 - 2718
2716
obtained by RCNN, ICRC, JAIST, A-ARC I and T-CRNN are 0.545, 0.545, 0.547, 0.57 and 0.95
respectively. The improved precision shows better performance for the proposed approach. The average
precision measure is shown in Figure 6.
Figure 5. Precision comparison for varying the
number of documents
Figure 6. Average precision value for various CQA
approaches
4.2. Recall
Recall computes the retrieved amount of relevant instance among the total amount of relevant
instances. The conventional approach achieves the recall values in the range between 0.55 and 0.6.
The proposed approach improves the recall result up to 0.88. For varying the number of documents, the recall
value obtained for the proposed approach is 0.88, 0.86, 0.85 and 0.8. The lower value is obtained with ICRC
and RCNN. The remaining conventional approaches achieve the value similar to the ICRC with slight
variation. The average recall value is computed by taking average for the number of documents used for
the work. The average recall in the range was between 0.5 and 0.55 for the conventional approach.
The proposed approach can achieve the recall between 0.8 and 0.85. The comparison of recall result is shown
in Figures 7 and 8.
Figure 7. Recall comparison for varying the number of
documents
Figure 8. Average recall value for various CQA
approaches
4.3. F-measure
The F-measure is computed as a harmonic mean between precision and recall. It is also termed as
balanced F- score. It determines the average of both precision and recall when they are close. Figures 9
and 10 show the proposed f-measure comparison by varying the number of documents and for different
approaches. For the proposed method it achieves the value between 0.85 and 0.95. The conventional
approaches are in the range of 0.45 and 0.57. The values obtained for f-measure computation is 0.927, 0.911,
0.897 and 0.86. The average f-measure value for RCNN, ICRC, JAIST, A-ARC I and T-CRNN is 0.548,
0.516, 0.555, 0.565 and 0.898.
8. Int J Elec & Comp Eng ISSN: 2088-8708
Convolutional recurrent neural network with template based representation... (A. Chandra Obula Reddy)
2717
Figure 9. F-measure comparison for varying
the number of documents
Figure 10. F-measure recall value for various CQA
approaches
4.4. Accuracy
The accuracy of the proposed work is estimated based on the amount of correct answer selection.
The accuracy is calculated with the number of documents as 300, 500, 700 and 1000. The accuracy of 300
documents is higher than the accuracy of the system with 1000 documents. This shows the increase in
the number of documents reduces the accuracy of the CQA system. The accuracy comparison for varying
the number of document is shown in Figure 11. The accuracy of proposed T-CRNN is higher than other
conventional approaches. The average accuracy for the proposed approach is higher than the conventional
approaches is shown in Figure 12. The precision, recall, f-measure and accuracy result comparison shows
the improved performance of the proposed CQA system.
Figure 11. Accuracy comparison for varying
the number of documents
Figure 12. Average accuracy for various CQA
approaches
5. CONCLUSION
In this work T-CRNN is proposed for answer selection in complex question answering system.
Initially, the given input question is decomposed into several questions. It is accomplished with replacing
the each entity of the question with the template through the use of knowledge base. Then the sentence
matrix of question and answer pair is created and it is represented in vector form by the approach of
pre-trained word embedding. The semantic matching pattern between question and answer is obtained with
CNN. The semantic correlation among sequence of answers is achieved with RNN. The scoring for each
answer is computed with multi-layer perceptron and softmax classifier.The proposed approach is evaluated
with the performance measures such as precision, recall, f-measure and accuracy. The performance of
the proposed approach is evaluated with the conventional methods and it shows the efficiency of
the proposed CQA system.
9. ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 10, No. 3, June 2020 : 2709 - 2718
2718
REFERENCES
[1] M. Sarrouti and S.E.A. Ouatik, "A passage retrieval method based on probabilistic information retrieval and UMLS
concepts in biomedical question answering," Journal of Biomedical Informatics, vol. 68, pp. 96-103, 2017.
[2] F.S. Utomo, et al., "New instances classification framework on quran ontology applied to question
answering system," TELKOMNIKA (Telecommunication Computing Electronics and Control), vol. 17, no. 1,
pp. 139-146, 2019.
[3] A.I. Adekitan, et al., "Determining the operational status of a three phase induction motor using a predictive
data mining mode," International Journal of Power Electronics and Drive System (IJPEDS), vol. 10, no. 1,
pp. 93-103, 2019.
[4] O. Kolomiyets and M.F. Moens, "A survey on question answering technology from an information retrieval
perspective," Information Sciences, vol. 181, pp. 5412-5434, 2011.
[5] U. Ungkawa, et al., "Methodology for constructing form ontology," Proceeding of the Electrical Engineering,
Computer Science and Informatics (EECSI), vol. 4, pp. 696-701, 2017.
[6] P. Moreda, et al., "Combining semantic information in question answering systems," Information Processing &
Management, vol. 47, pp. 870-885, 2011.
[7] R. Hong, et al., "Multimedia question answering," IEEE Multimedia, vol. 19, no. 4, pp. 72-78, 2012.
[8] S. Geerthik, et al., "AnswerRank: Identifying Right Answers in QA system," International Journal of Electrical
and Computer Engineering (IJECE), vol. 6, no. 4, pp. 1889-1896, 2016.
[9] O. Ferrandez, et al., "The QALL-ME framework: A specifiable-domain multilingual question answering
architecture," Web semantics: Science, services and agents on the world wide web, vol. 9, no. 2, pp. 137-145, 2011.
[10] C. Unger, et al., "An introduction to question answering over linked data," Springer International Publishing,
pp. 100-140, 2014.
[11] F. Liu, et al., "Toward automated consumer question answering: Automatically separating consumer
questions from professional questions in the healthcare domain," Journal of biomedical informatics, vol. 44,
pp. 1032-1038, 2011.
[12] L. Nie, et al., "Beyond text QA: multimedia answer generation by harvesting web information," IEEE Transactions
on Multimedia, vol. 15, no. 2, pp. 426-441, 2013.
[13] S.D. Patrick and M.F. Moens, "Knowledge and reasoning for question answering: Research perspectives,"
Information Processing & Management, vol. 47, pp. 899-906, 2011.
[14] S. Ivan and M. Bielikova, "A comprehensive survey and classification of approaches for community question
answering," ACM Transactions on the Web, vol. 10, no. 3, pp. 1-63, 2016.
[15] A. Konys, "Knowledge-based approach to question answering system selection," Springer, pp. 361-370, 2015.
[16] D. Asmita, et al., "Review on Opinion Mining for Fully Fledged System," Indonesian Journal of Electrical
Engineering and Informatics (IJEEI), vol. 4, no. 2, pp. 141-148, 2016.
[17] T. Enikuomehin and J.S. Sadiku, "Text Wrapping Approach to natural Language Information retrieval using
significant Indicator," IAES International Journal of Artificial Intelligence (IJ_AI), vol. 2, no. 3, pp. 136-142, 2013.
[18] A. Pal, et al., "Exploring question selection bias to identify experts and potential experts in community question
answering," ACM Transactions on Information Systems, vol. 30, no. 2, pp. 1-28, 2012.
[19] D.R. Liu, et al., "Integrating expert profile, reputation and link analysis for expert finding in question-answering
websites," Information processing & management, vol. 49, pp. 312-329, 2013.
[20] Z. Zhao, et al., "Expert finding for question answering via graph regularized matrix completion," IEEE
Transactions on Knowledge and Data Engineering, vol. 27, no. 4, pp. 993-1004, 2015.
[21] A. Barrera, et al., "SemQuest: University of Houston's Semantics-based Question Answering System," Research
gate, pp. 1-9, 2011.
[22] A.M. Bagiwa and S. B. Junaidu, "A Conceptual Framework for Adding Textual Annotations to Hierarchical Trees
Representing Ontologies as A means Of Achieving Semantic Interoperability between Different Ontologies of
the Same Domain," International journal of electrical, electronics and computer systems (IJEECS), vol. 1, no. 1,
pp. 1-5, 2011.
[23] W. Wu, et al., "Probase: A probabilistic taxonomy for text understanding," ACM., pp. 481-492, 2012.
[24] H.F. Pardede, et al., "Convolutional Neural Network and Feature Transformation for Distant Speech Recognition,"
International Journal of Electrical and Computer Engineering (IJECE), vol. 8, no. 6, pp. 5381- 5388, 2018.
[25] Z. Yang, et al., "Finding the right social media site for questions," IEEE/ACM, pp. 639-644, 2015.