The document compares two supervised machine learning algorithms, Naive Bayes and decision trees, for word sense disambiguation (WSD) using an empirical approach. It describes implementing both algorithms on a dataset of 15 English words annotated with senses from WordNet. Naive Bayes achieved 62.86% accuracy on the Senseval-3 test set, while decision trees achieved 45.14%. The document analyzes and compares the performance of the two approaches to determine which is more successful for WSD and how their combination could potentially improve accuracy.
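The paper's own features and corpus are not given here; as a minimal sketch of supervised Naive Bayes WSD, one can treat the context words around an ambiguous word as a bag of words and train a classifier over toy, hypothetical sense labels (the data and sense names below are illustrative, not the paper's):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy contexts of the ambiguous word "bank", labeled with
# WordNet-style sense identifiers (not the paper's actual corpus).
contexts = [
    "deposited money at the bank yesterday",
    "the bank raised its interest rates",
    "fishing along the river bank at dawn",
    "the muddy bank of the stream collapsed",
]
senses = ["bank.n.finance", "bank.n.finance", "bank.n.river", "bank.n.river"]

# Bag-of-words features over the surrounding context words.
vec = CountVectorizer()
X = vec.fit_transform(contexts)

clf = MultinomialNB()
clf.fit(X, senses)

# Disambiguate a new context by its surrounding words.
pred = clf.predict(vec.transform(["interest rates at the bank"]))[0]
print(pred)  # → bank.n.finance
```

The same bag-of-context-words features could feed a decision tree, which is how the two algorithms become directly comparable on one dataset.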
Modeling Text Independent Speaker Identification with Vector Quantization (TELKOMNIKA Journal)
Speaker identification is one of the most important technologies today. Many fields, such as bioinformatics and security, use speaker identification, and almost all electronic devices use this technology as well. Based on the text that must be spoken, speaker identification is divided into text-dependent and text-independent variants. In many fields, text-independent identification is mostly used because the text is unconstrained; as a result, it is generally more challenging than text-dependent identification. In this research, text-independent speaker identification on Indonesian speaker data was modelled with Vector Quantization (VQ). VQ with K-Means initialization was used: K-Means clustering initialized the means, and Hierarchical Agglomerative Clustering was used to identify the K value for VQ. The best VQ accuracy was 59.67% when K was 5. According to this result, the Indonesian language can be modelled by VQ. This research could be developed further using optimization methods for the VQ parameters, such as Genetic Algorithms or Particle Swarm Optimization.
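A rough sketch of the VQ-with-K-Means scheme the abstract describes: build one K-Means codebook per enrolled speaker, then identify a test utterance by the codebook with the lowest average quantization distortion. The features here are synthetic stand-ins (the paper would use acoustic frames such as MFCCs from Indonesian speakers); speaker names and dimensions are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic stand-in feature frames: each "speaker" clusters around a
# different region of a 13-dimensional (MFCC-like) feature space.
train = {
    "speaker_a": rng.normal(loc=0.0, scale=1.0, size=(200, 13)),
    "speaker_b": rng.normal(loc=5.0, scale=1.0, size=(200, 13)),
}

# One VQ codebook per speaker: K-Means centroids, with K=5 as in the
# paper's best-performing run.
codebooks = {
    spk: KMeans(n_clusters=5, n_init=10, random_state=0).fit(f).cluster_centers_
    for spk, f in train.items()
}

def identify(frames):
    # Assign the utterance to the speaker whose codebook yields the
    # lowest mean distance from each frame to its nearest centroid.
    def distortion(cb):
        d = np.linalg.norm(frames[:, None, :] - cb[None, :, :], axis=2)
        return d.min(axis=1).mean()
    return min(codebooks, key=lambda spk: distortion(codebooks[spk]))

test_frames = rng.normal(loc=5.0, scale=1.0, size=(50, 13))
result = identify(test_frames)
print(result)  # → speaker_b
```

In the paper, K itself is chosen with Hierarchical Agglomerative Clustering rather than fixed by hand as here.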
THE EFFECTS OF THE LDA TOPIC MODEL ON SENTIMENT CLASSIFICATION (IJSCAI)
Online reviews are feedback on a product and play a key role in improving it to cater to consumers. Categorizing online reviews manually is time-consuming and labor-intensive. Recurrent neural networks in deep learning can process time-series data, and long short-term memory networks handle long sequences well; this is well supported experimentally in natural language processing, machine translation, speech recognition, and language modeling. The quality of the extracted data features affects the classification results produced by the classification model. The LDA topic model adds prior knowledge when classifying the data, so that the characteristics of the data can be extracted efficiently; applied to the classifier, this can improve accuracy and efficiency. Bidirectional long short-term memory networks are variants and extensions of recurrent neural networks. The deep learning framework Keras, using TensorFlow as the backend, makes it convenient to build a bidirectional LSTM model, which provides strong technical support for the experiment. Using the LDA topic model to extract the keywords needed to train the neural network, and to strengthen the internal relationships between words, can improve the learning efficiency of the model. In the same experimental environment, the results are better than those obtained with traditional word-frequency features.
Chunking means splitting sentences into tokens and then grouping them in a meaningful way. For high-performance chunking systems, transformer models have proved to be the state-of-the-art benchmarks. Chunking as a task requires a large-scale, high-quality annotated corpus where each token is attached to a particular tag, similar to Named Entity Recognition tasks; these tags are later used in conjunction with pointer frameworks to find the final chunk. Solving this for a specific domain becomes highly costly in time and resources if a large, high-quality training set must be manually annotated. When the domain is specific and diverse, cold-starting becomes even more difficult because of the large number of manually annotated queries needed to cover all aspects. To overcome this problem, we applied a grammar-based text generation mechanism: instead of annotating individual sentences, we annotate grammar templates. We defined various templates corresponding to different grammar rules; to create a sentence, we used these templates together with the rules, with symbol or terminal values chosen from the domain data catalog. This let us create a large number of annotated queries, which were used to train an ensemble transformer-based deep neural network model [24]. We found that grammar-based annotation was useful for solving domain-based chunking of input query sentences without any manual annotation, achieving a token-classification F1 score of 96.97% on out-of-template queries.
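The template mechanism can be sketched in a few lines: each non-terminal slot carries a chunk tag, so expanding a template from a domain catalog yields a token-level annotated query for free. The templates, catalog entries, and tag names below are all hypothetical illustrations, not the paper's grammar.

```python
import random

# Hypothetical grammar templates: slots in angle brackets are filled from
# a domain catalog, and each slot carries its chunk tag, so generated
# sentences come out annotated.
templates = [
    ("show me <colour> <product> under <price>",
     {"<colour>": "COLOUR", "<product>": "PRODUCT", "<price>": "PRICE"}),
    ("find <brand> <product>",
     {"<brand>": "BRAND", "<product>": "PRODUCT"}),
]
catalog = {
    "<colour>": ["red", "blue"],
    "<product>": ["shoes", "jackets"],
    "<price>": ["$50", "$100"],
    "<brand>": ["acme"],
}

def generate(template, tags):
    tokens, labels = [], []
    for tok in template.split():
        if tok in catalog:
            tokens.append(random.choice(catalog[tok]))
            labels.append(tags[tok])
        else:
            tokens.append(tok)
            labels.append("O")  # outside any chunk
    return list(zip(tokens, labels))

random.seed(0)
annotated = generate(*templates[0])
print(annotated)
```

Looping this over all templates and catalog values produces the large annotated training set that would otherwise require manual labeling.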
Sentiment Analysis in Myanmar Language Using Convolutional LSTM Neural Network (kevig)
In recent years, there has been increasing use of social media among people in Myanmar, and writing reviews on social media pages about products, movies, and trips has also become popular. Moreover, most people look for review pages about a product before deciding whether to buy it. Extracting useful reviews about products of interest is important but time-consuming. Sentiment analysis is one of the key processes for extracting useful product reviews. In this paper, a Convolutional LSTM neural network architecture is proposed for sentiment classification of cosmetic reviews written in the Myanmar language. The paper also intends to build a cosmetic-reviews dataset for deep learning and a sentiment lexicon in the Myanmar language.
SENSE DISAMBIGUATION TECHNIQUE FOR PROVIDING MORE ACCURATE RESULTS IN WEB SEARCH (IJWSC Journal)
As the web grows exponentially, it becomes very difficult to provide relevant information to information seekers. While searching for information on the web, users can easily get lost in rich hypertext, and existing techniques often return results that are not up to the mark. This paper focuses on a technique that helps offer more accurate results, especially in the case of homographs. A homograph is a word that shares the same written form with another word but has a different meaning. The following sections describe how word senses can play an important role in offering accurate search results; by adopting this technique, users receive only relevant pages at the top of the search results.
FAST FUZZY FEATURE CLUSTERING FOR TEXT CLASSIFICATION (cscpconf)
Feature clustering is a powerful method for reducing the dimensionality of feature vectors in text classification. In this paper, Fast Fuzzy Feature Clustering for text classification is proposed, based on the framework of Jung-Yi Jiang, Ren-Jia Liou and Shie-Jue Lee (2011). Each word in a document's feature vector is grouped into a cluster in fewer iterations: the number of iterations required to obtain the cluster centers is reduced by transforming the cluster-center dimension from n dimensions to 2, using Principal Component Analysis with a slight change for the dimension reduction. Experimental results show that this method improves performance by significantly reducing the number of iterations required to obtain the cluster centers, as verified on three benchmark datasets.
MODELLING OF INTELLIGENT AGENTS USING A–PROLOG (IJAIA)
Nowadays, research in artificial intelligence has grown widely in areas such as knowledge representation, goal-directed behaviour and knowledge reusability, all of them directly relevant to improving intelligent agents in computer games. In particular, we focus on the development of a novel algorithm that allows an agent to combine fundamental capabilities such as reasoning, learning and simulation. The algorithm combines a system of logical rules with a learning-based simulation mechanism, giving our agent an infallible mechanism for decision-making in the game "connect four". The logic system is developed in a modelling language known as Answer Set Programming, or A–Prolog, a paradigm that integrates ideas from two well-known traditions, Prolog and answer set semantics.
EXTRACTIVE SUMMARIZATION WITH VERY DEEP PRETRAINED LANGUAGE MODEL (IJAIA)
The recent development of generative pretrained language models has proven very successful on a wide range of NLP tasks, such as text classification, question answering and textual entailment. In this work, we present a two-phase encoder-decoder architecture based on Bidirectional Encoder Representations from Transformers (BERT) for the extractive summarization task. We evaluated our model with both automatic metrics and human annotators, and demonstrated that the architecture achieves results comparable to the state of the art on a large-scale corpus, CNN/Daily Mail. To the best of our knowledge, this is the first work that applies a BERT-based architecture to a text summarization task and achieves results comparable to the state of the art.
Document Classification Using KNN with Fuzzy Bags of Word Representations
Abstract — Text classification assigns documents to categories based on the words, phrases and word combinations they contain. Many applications use text classification, for example in artificial intelligence, to maintain data according to category. Certain keywords, called topics, are selected to classify a given document; using these topics, the main idea of the document can be identified, so selecting them is an important step in classifying the document by category. In the proposed system, keywords are extracted from documents using TF-IDF and WordNet: the TF-IDF algorithm selects the important words by which a document can be classified, and WordNet is used to find the similarity between these candidate words. The words with the maximum similarity are taken as the topics (keywords). In this experiment we used the TF-IDF model to find similar words in order to classify documents. While a decision tree algorithm gives better accuracy for text classification than other algorithms, we use a fuzzy system to classify text written in natural language according to topic. A fuzzy classifier is necessary for this task because a given text can cover several topics to different degrees; traditional classifiers are inappropriate in this context, as they attempt to sort each text into a single class in a winner-takes-all fashion. The classifier we propose automatically learns its fuzzy rules from training examples. We have applied it to classify news articles, and the results we obtained are promising. The dimensionality of the feature vector is very important in text classification, and it can be decreased by clustering based on fuzzy logic: depending on their similarity, documents can be classified and formed into clusters according to their topics.
After the clusters are formed, documents can be accessed and stored easily. In this way we can find the similarity and summarize the words, called topics, which can be used to classify the documents.
WARRANT GENERATION USING A LANGUAGE MODEL AND A MULTI-AGENT SYSTEM (IJNLC)
Each argument begins with a conclusion, which is followed by one or more premises supporting it. The warrant is a critical component of Toulmin's argument model: it explains why the premises support the claim. Despite its critical role in establishing the claim's veracity, it is frequently omitted or left implicit, leaving readers to infer it. We consider the problem of producing more diverse and higher-quality warrants in response to a claim and evidence. First, we employ BART [1] as a conditional sequence-to-sequence language model to guide the output generation process, fine-tuning the BART model on the ARCT dataset [2]. Second, we propose the Multi-Agent Network for Warrant Generation, a model for producing more diverse and higher-quality warrants by combining Reinforcement Learning (RL) and Generative Adversarial Networks (GANs) with a mechanism of mutual awareness among agents. Our model generates a greater variety of warrants than other baseline models, and the experimental results validate the effectiveness of the proposed hybrid model for generating warrants.
LOG MESSAGE ANOMALY DETECTION WITH OVERSAMPLING (IJAIA)
Imbalanced data is a significant challenge for classification with machine learning algorithms. This is particularly important for log message data, where negative logs are sparse, so the data is typically imbalanced. In this paper, a model that generates text log messages using a SeqGAN network is proposed; an autoencoder is used for feature extraction, and anomaly detection is done with a GRU network. The proposed model is evaluated on three imbalanced log datasets, namely BGL, OpenStack, and Thunderbird. The results show that appropriate oversampling and data balancing improve anomaly detection accuracy.
SEMI-SUPERVISED BOOTSTRAPPING APPROACH FOR NAMED ENTITY RECOGNITION (kevig)
The aim of Named Entity Recognition (NER) is to identify references to named entities in unstructured documents and to classify them into pre-defined semantic categories. NER often benefits from added background knowledge in the form of gazetteers; however, such a collection does not deal with name variants and cannot resolve the ambiguities involved in identifying entities in context and assigning them to predefined categories. We present a semi-supervised NER approach that starts by identifying named entities with a small set of training data. From the identified named entities, word and context features are used to define a pattern; the pattern of each named-entity category is then used as a seed pattern to identify the named entities in the test set. Pattern scoring and a tuple-value score enable the generation of new patterns for identifying the named-entity categories. We evaluated the proposed system for English with tagged (IEER) and untagged (CoNLL 2003) named-entity corpora, and for Tamil with documents from the FIRE corpus, yielding an average F-measure of 75% for both languages.
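The bootstrapping cycle can be illustrated at toy scale: context windows around seed entities become patterns, and those patterns harvest new entity candidates from unlabeled text. The seeds, sentences, and pattern shape below are invented for illustration, and the paper's pattern-scoring and tuple-value scoring are not reproduced.

```python
import re

# Hypothetical seed entity and a tiny unlabeled corpus.
seed_entities = {"PERSON": ["Alice"]}
corpus = [
    "Alice said the plan works",
    "Bob said the plan works",
    "The plan failed badly",
]

# Step 1: derive context patterns from sentences containing seed entities,
# replacing the entity itself with a capture group.
patterns = set()
for sent in corpus:
    for name in seed_entities["PERSON"]:
        if name in sent:
            patterns.add(sent.replace(name, r"(\w+)"))

# Step 2: apply each seed-derived pattern back to the corpus to extract
# new candidate entities for the PERSON category.
found = set()
for pat in patterns:
    for sent in corpus:
        m = re.fullmatch(pat, sent)
        if m:
            found.add(m.group(1))

print(sorted(found))  # → ['Alice', 'Bob']
```

In the full system, each harvested candidate and pattern would be scored, and only high-scoring ones would seed the next bootstrapping round.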
In non-parametric statistics, a kernel is a weighting function used in non-parametric estimation techniques. A kernel is a non-negative, real-valued, symmetric and integrable function K. Several types of kernel functions are commonly used: uniform, triangular, Epanechnikov, quartic (biweight), tricube, triweight, Gaussian, quadratic and cosine. In this presentation we discuss the properties and applications of kernel functions.
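Three of the kernels listed above can be written down directly, and the integrability property (each integrates to 1 over the real line) checked numerically, here with a simple Riemann sum:

```python
import numpy as np

# Common kernels: u is the scaled distance; each vanishes outside
# |u| <= 1 except the Gaussian, which has unbounded support.
def uniform(u):      return np.where(np.abs(u) <= 1, 0.5, 0.0)
def epanechnikov(u): return np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0)
def gaussian(u):     return np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)

# Numerically integrate each kernel over [-5, 5]; all should be ~1.
u = np.linspace(-5, 5, 100001)
du = u[1] - u[0]
integrals = {k.__name__: float(k(u).sum() * du)
             for k in (uniform, epanechnikov, gaussian)}
print(integrals)
```

The same functions K((x - x_i)/h) / h, summed over data points x_i and divided by n, give a kernel density estimate with bandwidth h.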
Supervised WSD Using Master–Slave Voting Technique (IOSR-JCE)
IOSR Journal of Computer Engineering (IOSR-JCE) is a double-blind peer-reviewed international journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes high-quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high-quality technical notes are invited for publication.
Representation as a state graph
Global Problem Solver
Uninformed (blind) search algorithms
Informed search algorithms
Depth First Search
Breadth First Search
Best First Search
A, A*
Heuristic function, admissible heuristic function
A hybrid composite features based sentence level sentiment analyzer (IAESIJAI)
Current lexicon- and machine-learning-based sentiment analysis approaches still suffer from a two-fold limitation. First, manual lexicon construction and machine training are time consuming and error-prone. Second, prediction accuracy requires that sentences and their corresponding training text fall under the same domain. In this article, we experimentally evaluate four sentiment classifiers, namely support vector machines (SVM), Naive Bayes (NB), logistic regression (LR) and random forest (RF). We quantify the quality of each of these models using three real-world datasets that comprise 50,000 movie reviews, 10,662 sentences, and 300 generic movie reviews. Specifically, we study the impact of a variety of natural language processing (NLP) pipelines on the quality of the predicted sentiment orientations. Additionally, we measure the impact of incorporating lexical semantic knowledge captured by WordNet on expanding original words in sentences. The findings demonstrate that utilizing different NLP pipelines and semantic relationships affects the quality of the sentiment analyzers. In particular, the results indicate that coupling lemmatization with knowledge-based n-gram features produces higher accuracy: with this coupling, the accuracy of the SVM classifier improved to 90.43%, while the three other classifiers achieved 86.83%, 90.11%, and 86.20%, respectively.
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES (kevig)
Distributed language representation has become the most widely used technique for language representation in various natural language processing tasks. Most natural language processing models that are based on deep learning techniques use already pre-trained distributed word representations, commonly called word embeddings. Determining the most qualitative word embeddings is of crucial importance for such models. However, selecting the appropriate word embeddings is a perplexing task, since the projected embedding space is not intuitive to humans. In this paper, we explore different approaches for creating distributed word representations. We perform an intrinsic evaluation of several state-of-the-art word embedding methods. Their performance on capturing word similarities is analysed with existing benchmark datasets for word-pair similarities. The research in this paper conducts a correlation analysis between ground-truth word similarities and similarities obtained by different word embedding methods.
Sentiment analysis typically refers to using natural language processing, text analysis, and computational linguistics to extract affect- and emotion-based information from text data. Our work explores how deep neural networks can be used effectively in transfer learning and joint dual input learning settings to classify sentiments and detect hate speech in Hindi and Bengali data.
Derric A. Alkis C
Abstract:
Delivering to the customer a high degree of confidence, and to the seller more information about the products and the desires of customers, through the use of modern technology and machine learning: comments left on a product are collected and evaluated, and the product is thereby rated as good or bad.
An Approach for Big Data to Evolve the Auspicious Information from Cross-Domains (IJECEIAES)
Sentiment analysis is the pre-eminent technology to extract the relevant information from a data domain. In this paper the cross-domain sentiment classification approach Cross_BOMEST is proposed. The proposed approach extracts positive (+ve) words using the existing BOMEST technique; with the help of MS Word Interop, Cross_BOMEST determines +ve words and replaces all their synonyms to escalate the polarity, then blends two different domains and detects all the self-sufficient words. The proposed algorithm is executed on Amazon datasets, where two different domains are trained to analyze the sentiments of the reviews of the remaining domain. The proposed approach contributes propitious results in cross-domain analysis, and an accuracy of 92% is obtained. The precision and recall of BOMEST are improved by 16% and 7%, respectively, by Cross_BOMEST.
Chunker Based Sentiment Analysis and Tense Classification for Nepali Text (kevig)
The article presents Sentiment Analysis (SA) and Tense Classification using the skip-gram model for word-to-vector encoding of Nepali. The experiment on SA for positive-negative classification is carried out in two ways. In the first experiment the vector representation of each sentence is generated using the skip-gram model followed by Multi-Layer Perceptron (MLP) classification, and an F1 score of 0.6486 is achieved for positive-negative classification, with an overall accuracy of 68%. In the second experiment the verb chunks are extracted using a Nepali parser and a similar experiment is carried out on the verb chunks; an F1 score of 0.6779 is observed for positive-negative classification, with an overall accuracy of 85%. Hence, chunker-based sentiment analysis proves better than sentiment analysis using whole sentences. This paper also proposes using the skip-gram model to identify the tenses of Nepali sentences and verbs. In the third experiment, the vector representation of each verb chunk is generated using the skip-gram model followed by MLP classification, and the verb chunks give a very low overall accuracy of 53%. The fourth experiment, conducted for tense classification using sentences, results in improved efficiency with an overall accuracy of 89%; past tenses are identified and classified more accurately than other tenses. Hence, sentence-based tense classification proves better than verb-chunker-based tense classification.
Evaluating sentiment analysis and word embedding techniques on Brexit (IAESIJAI)
In this study, we investigate the effectiveness of pre-trained word embeddings for sentiment analysis on a real-world topic, namely Brexit. We compare the performance of several popular word embedding models, such as global vectors for word representation (GloVe), FastText, word to vec (word2vec), and embeddings from language models (ELMo), on a dataset of tweets related to Brexit, and evaluate their ability to classify the sentiment of the tweets as positive, negative, or neutral. We find that pre-trained word embeddings provide useful features for sentiment analysis and can significantly improve the performance of machine learning models. We also discuss the challenges and limitations of applying these models to complex, real-world texts such as those related to Brexit.
Enhanced sentiment analysis based on improved word embeddings and XGboost (IJECEIAES)
Sentiment analysis is a well-known and rapidly expanding study topic in natural language processing (NLP) and text classification. This approach has evolved into a critical component of many applications, including politics, business, advertising, and marketing. Most current research focuses on obtaining sentiment features through lexical and syntactic analysis. Word embeddings explicitly express these characteristics. This article proposes a novel method, improved words vector for sentiments analysis (IWVS), using XGboost to improve the F1-score of sentiment classification. The proposed method constructed sentiment vectors by averaging the word embeddings (Sentiment2Vec). We also investigated the Polarized lexicon for classifying positive and negative sentiments. The sentiment vectors formed a feature space to which the examined sentiment text was mapped to. Those features were input into the chosen classifier (XGboost). We compared the F1-score of sentiment classification using our method via different machine learning models and sentiment datasets. We compare the quality of our proposition to that of baseline models, term frequency-inverse document frequency (TF-IDF) and Doc2vec, and the results show that IWVS performs better on the F1-measure for sentiment classification. At the same time, XGBoost with IWVS features was the best model in our evaluation.
The sarcasm detection with the method of logistic regression (EditorIJAERD)
Prediction analysis is an approach that can forecast future possibilities. This research work addresses sarcasm detection in text data. Previously, SVM classification was applied for sarcasm detection; the SVM classifier separates the data with a hyperplane, which gives low accuracy. To improve accuracy for sarcasm detection, logistic regression is applied in this work. The existing and proposed techniques are implemented in Python, and the results are analysed in terms of accuracy and execution time. The proposed approach has higher accuracy and lower execution time than the SVM classifier for sarcasm detection.
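As a rough illustration of this kind of classifier (not the authors' actual implementation, and with invented toy data), a logistic-regression sarcasm detector over bag-of-words features can be sketched in pure Python:

```python
import math

# Toy bag-of-words training data (invented for illustration): 1 = sarcastic.
docs = [("oh great another monday", 1),
        ("great work on the project", 0),
        ("yeah right that will totally work", 1),
        ("this approach works well", 0)]

vocab = sorted({w for text, _ in docs for w in text.split()})

def featurize(text):
    words = text.split()
    return [float(words.count(w)) for w in vocab]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

X = [featurize(t) for t, _ in docs]
y = [label for _, label in docs]
w = [0.0] * len(vocab)
b = 0.0

# Stochastic gradient descent on the logistic loss.
for _ in range(500):
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wi * f for wi, f in zip(w, xi)) + b)
        err = p - yi
        w = [wi - 0.1 * err * f for wi, f in zip(w, xi)]
        b -= 0.1 * err

def predict(text):
    """Probability that the text is sarcastic."""
    return sigmoid(sum(wi * f for wi, f in zip(w, featurize(text))) + b)

print(predict("yeah right great monday"))  # words associated with the sarcastic class
```

In practice a library implementation (e.g. a regularized solver) would replace the hand-rolled gradient loop, but the decision rule is the same sigmoid over a weighted feature sum.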
Abstract—There are many robust algorithms for word sense disambiguation (WSD) using machine learning, and it is very difficult to compare them without implementing them empirically. In this work, we developed Java code to analyse and compare two of the most successful supervised machine learning approaches, namely Naïve Bayes and decision tree, using WordNet and Senseval-3 for word sense disambiguation of words in context. In comparing these two approaches, the paper shows that supervised learning methods prove effective. Both algorithms use a common data set, training file, and testing file to calculate their accuracy in predicting the exact sense of a word.
Index Terms—Supervised learning, Naïve Bayes, decision tree, WSD, WordNet.
I. INTRODUCTION
In recent years, many researchers have worked empirically in the NLP field on the WSD problem. The task of removing the ambiguity from words by selecting the proper sense is called WSD; it requires examining a word in context and determining which sense is used. Many words have multiple meanings according to the context of speech. For example, the word "recompense" has different meanings in context, as in the screenshot in Fig. 1 [1]:
Fig. 1. Screenshot from WordNet showing the multiple senses of the word "recompense".
Experiments in this field have shown that many methods can be used and adopted in this domain; each must be analysed and tested empirically to establish whether the objectives of the research are achieved. Our goal is to remove the ambiguity from a word by selecting the correct sense as annotated in WordNet. Word sense disambiguation is the task that examines a word in context and selects the proper meaning among the many senses associated with the word. The WSD task is important for many purposes in natural language processing, such as machine learning, information retrieval, and other NLP applications [2].
This study is one of the experiments of our current PhD research work towards a master-slave technique [3]. In this paper we focus on two supervised methods: decision tree, which is based on classification rules, and Naïve Bayes, a probabilistic learning method. We present an analysis and comparison of these two supervised learning approaches, from the selection of the data set, through training and testing, to the calculation of their results [4]. Nevertheless, word sense disambiguation remains an open problem in natural language processing, and there is scope to enhance the accuracy of selecting the proper sense [5].
II. MOTIVATION AND APPLICATION
WSD matters wherever an input is accepted and the user's intended meaning influences the result to be displayed, especially in a search engine, which displays results after accepting input from the user. The same holds for every domain that works on this concept, where input is accepted to deliver a corresponding output, and for every NLP application whose result could be affected by a correct or incorrect interpretation [5].
There are many applications for word sense disambiguation, such as:
Information retrieval [6]: Data is retrieved using a search engine or a similar interface. If we do not specify the sense, there is no way the system can analyze the meaning without the help of a WSD technique. There are many examples where a query could lead to an incorrect meaning of a given word, such as [bank, the financial institution] versus [bank, the edge of a river].
Machine translation: Conveying information to the machine correctly, so that further conversion using an intermediate code can be carried out; for example, flexographic conversion.
Phonetics: Disambiguation can be required not only in NLP but also in domains where speech is converted to text. For example, [eat, to consume food] versus [it, a pronoun]: the pronunciation could lead to different text and thereby a different meaning.
Comparative Analysis between Naïve Bayes Algorithm and Decision Tree to Solve WSD Using Empirical Approach
Lecture Notes on Software Engineering, Vol. 4, No. 1, February 2016
DOI: 10.7763/LNSE.2016.V4.228
Boshra F. Zopon Al-Bayaty and Shashank Joshi
Manuscript received September 2, 2014; revised November 10, 2014.
Boshra F. Zopon Al-Bayaty is with Bharati Vidyapeeth University, Pune, India. She is also with Al-Mustansiriya University, Baghdad, Iraq (e-mail: bushraalbayaty123@gmail.com).
Shashank Joshi is with Bharati Vidyapeeth University, Pune, India.
III. RELATED WORK
Word sense disambiguation is one of the open problems in NLP. Plenty of work has been carried out to solve it, but there is still much scope to contribute to this field by identifying the sense of a given word correctly. Generally, disambiguation is resolved using many approaches; the main approaches include [7], [8]:
Supervised approaches: the system is trained to correctly identify the meaning of a particular word.
Unsupervised approaches: the result is obtained from groups or clusters of the required data.
There are many robust algorithms, such as Naïve Bayes, SVM, decision tree, decision list, and KNN, that can be used to address word sense disambiguation.
IV. SUPERVISED MACHINE LEARNING APPROACHES
Machine learning approaches can be used to discover relationships in the internal structure of the data and to produce correct outputs. These approaches include Naïve Bayes, decision list, decision tree, support vector machines, and other supervised machine learning methods.
TABLE I: NAÏVE BAYES ATTRIBUTE EXAMPLE
Attribute  Description
X          3 MP camera, 5 inch screen, 50 g weight, black color
Y          10 MP camera, 4.8 inch screen, waterproof, purple color
Z          8 MP camera, 6 inch screen, 100 g weight, white color
A. Naïve Bayes Approach
This approach is one of the important algorithms used in data mining. It is based on conditional probability. In the Naïve Bayes algorithm, information about various objects and their properties is collected during the training phase, and the system is trained to identify a new object based on the attributes of the known objects. Consider, for example, the selection of a mobile phone: a new phone is added and its category is identified based on the information available about its attributes. Consider a scenario where three mobiles, X, Y, and Z, are described as in Table I [11].
The system is trained with this information, and when we want to identify a new mobile, its individual attributes are evaluated and a match is found.
For the implementation, the WordNet data source is used; this repository provides the mapping between a word and the different senses associated with it. For the experiment we referred to a data set of 10 nouns and 5 verbs, containing the following words [12]:
Data set of POS (n) = {Praise, Name, Lord, Worlds, Owner, Recompense, Straight, Path, Anger, Day}.
Data set of POS (v) = {Worship, Rely, Guide, Favored, Help}.
Box 1. Naïve Bayes algorithm implemented on our data set.
1) Naïve Bayes network
In this section, the Naïve Bayes classifier is implemented for the word "Path" from our data set, which has four senses (s); the calculations involved are as mentioned in Fig. 2:
P(s1) = 3/14 = 0.214
P(s2) = 2/14 = 0.142
P(s3) = 5/14 = 0.357
P(s4) = 4/14 = 0.285
Fig. 2. Screenshot from WordNet showing the multiple senses of the word "path".
1. Initialize context c, sense s, and ambiguous word w.
2. As per the training context,
3. p(s | w, c) = p(w | s, c) p(s | c) / p(w | c)
   Calculate the maximizing p(s | w, c).
4. Select the sense with the highest value.
5. Map the sense according to the highest accuracy.
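The scoring rule in Box 1 can be sketched as follows. The sense names and training contexts below are invented for illustration (the paper itself uses WordNet senses and Senseval-3 data), and the add-one smoothing is an assumption not stated in the text:

```python
from collections import Counter

# Toy sense-tagged training contexts for the ambiguous word "path"
# (invented for illustration).
training = [
    ("course",  ["travel", "life", "course", "which"]),
    ("course",  ["travel", "course", "road"]),
    ("pattern", ["life", "pattern", "behavior"]),
]

sense_counts = Counter(s for s, _ in training)
word_counts = {s: Counter() for s in sense_counts}
for sense, ctx in training:
    word_counts[sense].update(ctx)

def score(sense, context):
    """Naive Bayes score: p(s) * prod p(f | s), with add-one smoothing."""
    vocab = {w for c in word_counts.values() for w in c}
    prior = sense_counts[sense] / len(training)
    total = sum(word_counts[sense].values())
    likelihood = 1.0
    for f in context:
        likelihood *= (word_counts[sense][f] + 1) / (total + len(vocab))
    return prior * likelihood

context = ["travel", "course"]
best = max(sense_counts, key=lambda s: score(s, context))
print(best)
```

The sense whose score is maximal is mapped to the word, which mirrors steps 3-5 of Box 1.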
To use the WordNet repository, the Senseval XML mapping technique is used [13], where the given data set and senses are expressed in XML. To ensure the effective working of the decision tree, training and testing files are used; the job of these files is to provide the context that is extremely useful for knowing the exact meaning of a particular word. For implementing the C4.5 algorithm, the Eclipse IDE is used; while implementing it, the equations related to entropy are implemented. The algorithm we applied is shown below.
In this work we compare the well-known supervised learning approaches Naïve Bayes and decision tree, both of which have a long and successful history in this field. Naïve Bayes is the approach most commonly used in word sense disambiguation. We implemented the algorithms using WordNet 2.1; our Naïve Bayes implementation achieved 62.86% accuracy on Senseval-3 [9], and our decision tree implementation achieved 45.14% accuracy [10]. Our goal is to see which one is the most successful in performance through a comparison of the two algorithms, to study the factors affecting them, and to explore the possibility of improving the performance and accuracy of each by combining them in future work, as shown in Table I.
Consider four different words selected from the bag of words: f1 = Travel, f2 = Life, f3 = Course, f4 = Which. For sense s3 ("pattern"):
F1 = 2/14 = 0.142
F2 = 1/14 = 0.0714
F3 = 2/14 = 0.142
F4 = 3/14 = 0.214
For designing the Bayesian network, each feature value is set against the sense probability P(s3) = 0.357:
Feature  F       F/s    s
1        0.142   0.397  0.357
2        0.0714  0.200  0.357
3        0.142   0.397  0.357
4        0.214   0.599  0.357
Bayesian networks give details about the contribution of each individual feature to one of the senses available for a given word. The word sense feature combines the Bayesian network and helps to check the individual assessment of every feature (F1, F2, F3, F4), as seen in Fig. 3.
Fig. 3. The Naïve Bayesian network.
B. Decision Tree
Box 2. C4.5 algorithm implemented on our data set.
A decision tree is a predictive model that helps to take decisions based on the available statistics (past information). In a decision tree, the branches provide the attribute or related condition on which a decision is made in the form of nodes (yes or no) [14]. If a clear decision is not made by the branches, then the information gain is checked; whichever node has the highest information gain is declared the correct or final decision. In the C4.5 algorithm, the information gain is calculated from the entropy each time, which is useful in making decisions.
Consider a simple example of weather forecasting, in which decisions are made or predicted to remove uncertainty: if the clouds in the sky are dense, there will be rain; if there is rain, then the temperature will decrease and the humidity will increase [15].
Decisions can thus be made based on the available information, deciding the destination on the basis of the highest value of information gain. Box 2 shows the algorithm we applied in our study.
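The weather example above can be written as a minimal hand-built decision tree. The thresholds and return labels are illustrative, not taken from the paper:

```python
# A minimal hand-built decision tree for the weather example in the text:
# dense clouds lead to rain; rain lowers temperature and raises humidity.
def predict_weather(dense_clouds: bool, humidity: float) -> str:
    # Each "if" is a branch testing an attribute; each return is a leaf decision.
    if dense_clouds:
        return "rain"
    if humidity > 0.8:          # illustrative threshold
        return "possible rain"
    return "no rain"

print(predict_weather(True, 0.3))
print(predict_weather(False, 0.9))
```

A learned tree (e.g. via C4.5) would choose these attribute tests automatically by maximizing information gain at each node rather than hard-coding them.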
V. WORD SENSE DISAMBIGUATION EXPERIMENTS WITH
NAÏVE BAYES AND DECISION TREE
A. Dataset
We started with the dataset provided by http://www.e-quran.com/language/english. The dataset comprises 15 English words, 10 nouns and 5 verbs, such as "path" and "help", which are ambiguous. Since one of our particular goals is to train on the dataset and disambiguate the words by selecting the proper meaning in context [16], we used WordNet, available at http://wordnet.princeton.edu, to provide the sense information for the words. To test and evaluate both approaches and make sure senses are properly assigned to words, we used Senseval-3 in our empirical work, as seen in Table II.
TABLE II: DECISION TREE ATTRIBUTE EXAMPLE
Attribute    Decision: Yes    Decision: No    Status
Rain         less temp        more temp       ------
Temperature  less humidity    more humidity   cannot say
Humidity     less temp        more temp       cannot say
B. Analysis
The results acquired by the Naïve Bayes and decision tree approaches are compared. In some cases the Naïve Bayes approach gives better results, and in others the decision tree is more efficient; if the size of the tree is small, the decision tree gives better results. Overall, the accuracy of Naïve Bayes exceeds the accuracy of the decision tree.
C. Modeling
1) Dictionary: the data source.
2) Training: the context providing the basis for disambiguation.
3) Testing: verification of the data and its meaning.
4) Sense map: the mapping between word and sense.
1. Read the data set and calculate the POS (e.g. recompense).
2. Prepare the context containing the various senses of the word (e.g. recompense - reward).
3. Calculate the frequency in context of the positive (p+) and negative (p-) classes.
4. Calculate the information gain by first calculating the entropy:
   Entropy(S) = -p+ log2(p+) - p- log2(p-)
5. Gain(S, A) = Entropy(S) - sum over v in A of (|Sv| / |S|) * Entropy(Sv)
6. Select the highest (entropy, attribute ratio).
7. E.g. Gain(S, A) for recompense = 0.593
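Steps 4 and 5 can be sketched directly from the entropy and gain formulas. The example counts below are illustrative (the classic 9-positive/5-negative split), not taken from the paper's data set:

```python
import math

def entropy(pos, neg):
    """Entropy(S) = -p+ log2(p+) - p- log2(p-); a term is 0 when its class is empty."""
    total = pos + neg
    e = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            e -= p * math.log2(p)
    return e

def gain(parent, partitions):
    """Gain(S, A) = Entropy(S) - sum(|Sv|/|S| * Entropy(Sv)) over attribute values v."""
    total = sum(p + n for p, n in partitions)
    remainder = sum((p + n) / total * entropy(p, n) for p, n in partitions)
    return entropy(*parent) - remainder

print(round(entropy(9, 5), 3))                    # 0.940
print(round(gain((9, 5), [(6, 2), (3, 3)]), 3))
```

C4.5 evaluates this gain for every candidate attribute and splits on the one with the highest value, exactly as in step 6.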
The application is made up of a number of modules; some of the important modules are listed in the Modeling section above [17]. Apart from these, there are many packages and classes that calculate the accuracy of a sense.
D. Design
To address word sense disambiguation, semi-structured data is used to enhance performance. The algorithm, along with the given context, trains the system to judge the correct sense, which is further verified by the testing file to ensure the correct meaning of the sense, as seen in Table III.
E. Training
A data set of 10 nouns and 5 verbs is used. To build an understanding of the senses, the system is trained by referring to the Senseval-3 structure to map each word to a sense using the surrounding context. This entire structure uses the XML format to represent and process data in a semi-structured approach.
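A hedged sketch of reading such a Senseval-style XML instance with Python's ElementTree follows; the tag names here are illustrative, not the exact Senseval-3 DTD:

```python
import xml.etree.ElementTree as ET

# A hypothetical Senseval-style instance (tag names are illustrative).
xml_data = """
<corpus>
  <lexelt item="path.n">
    <instance id="path.n.1">
      <answer senseid="course"/>
      <context>he chose a different <head>path</head> in life</context>
    </instance>
  </lexelt>
</corpus>
"""

root = ET.fromstring(xml_data)
for instance in root.iter("instance"):
    sense = instance.find("answer").get("senseid")
    # itertext() flattens the context, including the <head> word itself.
    context = "".join(instance.find("context").itertext()).split()
    print(instance.get("id"), sense, context)
```

Each parsed (context, sense) pair is what the training phase feeds to the classifiers.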
TABLE III: DATA SET OF WORDS AND RESULTS OF NAÏVE BAYES AND DECISION TREE CLASSIFIERS
Word        POS  #Senses  NB Score  NB Accuracy  DT Score  DT Accuracy
Praise      n    2        0.408     0.592        405       593
Name        n    6        0.189     1.0          184       1000
Worship     v    3        0.172     0.414        308       425
Worlds      n    8        0.137     1.0          1000      1000
Lord        n    3        0.341     0.681        187       426
Owner       n    2        0.406     0.594        405       595
Recompense  n    2        0.48      0.594        405       595
Trust       v    6        0.167     0.167        167       167
Guide       v    5        0.352     0.648        199       247
Straight    n    3        0.496     0.504        462       462
Path        n    4        0.415     0.585        316       316
Anger       n    3        0.412     0.588        462       462
Day         n    10       0.109     1.0          109       109
Favored     v    4        0.587     0.648        250       250
Help        v    8        0.352     0.414        125       125
F. Testing
Fig. 4. Screenshot showing the training and compilation model.
The given data is tested with an XML file that contains context without a direct mapping to sense. This approach results in accurate prediction of the sense for a given word [18], as seen in Fig. 4.
G. The Execution Steps
We summarise our execution steps as follows:
Data source: decide on suitable data to be checked for WSD. Select sample words to check the behavior of the algorithms; in the experiment, 10 nouns and 5 verbs are used.
Dictionary: refer to the format at senseval.org. Prepare XML-format content to help resolve the sense of the data with respect to a unique ID [19].
Algorithm: write code to check the accuracy of predicting the exact sense of a word by referring to the given context for the data set selected as mentioned above.
Execution: run the algorithms in Eclipse Kepler to get the score for each given sense, and select the sense with the highest accuracy as the final result.
H. The System Answer
The results of word sense disambiguation are stored in a file called systemAnswer.txt. This file displays the score for the respective senses of each word in the given dataset [20].
The score is calculated on a scale of 1000. The sense with the highest accuracy score is considered the correct sense identification. After performing the experiment, the overall accuracy of the Naïve Bayes algorithm is 62.86%, and that of the decision tree is 45.14%. This accuracy is calculated on a data set of 10 nouns and 5 verbs on the basis of context to resolve the meaning of a word.
The results show that for some words the Naïve Bayes algorithm provides better results, for example {Name, Worlds, Day}, while for other cases the decision tree provides better accuracy values, for example {Name, Worlds}.
The screenshot below shows the systemAnswer.txt file for the decision tree implementation.
VI. THE FINAL RESULT
The accuracies obtained with the decision tree and Naïve Bayes approaches are summarized here. In terms of overall accuracy, the Naïve Bayes approach gives the more accurate results, as shown in Table IV and Fig. 5:
Fig. 5. Screenshot showing the systemAnswer.txt file compilation model.
VII. CONCLUSION
No method can be 100% accurate. Accuracy depends on the data set, the context, and the algorithm used to implement word sense disambiguation, and is likely to vary with these parameters. Still, the Naïve Bayes approach gives the more accurate result in the same time, as per our experiment. Table IV below shows the final accuracy results for both approaches.
ACKNOWLEDGMENT
The first author thanks her research guide, Dr. Shashank Joshi (Professor at Bharati Vidyapeeth University, College of Engineering), for his advice during the preparation of this paper.
REFERENCES
[1] Princeton. [Online]. Available: http://wordnet.princeton.edu.
[2] D. Yarowsky, “Hierarchical decision lists for word sense disambiguation,” Computers and the Humanities, vol. 34, pp. 179-186, 2000.
[3] B. F. Zopon Al-Bayaty and S. Joshi, “Conceptualisation of knowledge
discovery from web search,” International Journal of Scientific &
Engineering Research, vol. 5, no. 2, pp. 1246-1248, February 2014.
[4] P. P. Borah, G. Talukdar, and A. Baruah, “Approaches for word sense
disambiguation – A survey,” International Journal of Recent
Technology and Engineering, vol. 3, no. 1, March 2014.
[5] N. Indurkhya and F. J. Damerau, Handbook of Natural Language Processing, 2nd Ed., USA: Chapman & Hall/CRC, 2010.
[6] B. F. Z. Al-Bayaty and S. Joshi, “Word sense disambiguation (WSD)
and information retrieval (IR): Literature review,” International
Journal of Advanced Research in Computer Science and Software
Engineering, vol. 4, no. 2, pp. 722-726, February 2014.
[7] M. Joshi, S. Pakhomov, and C. G. Chute, “A comparative study of
supervised learning as applied to acronym expansion in clinical
reports,” in Proc. AMIA Annu Symp, 2006, pp. 399–403.
[8] T. Pedersen, “A decision tree of bigrams is an accurate predictor of
word sense,” in Proc. Second Meeting of the North American Chapter
of the Association for Computational Linguistics on Language
Technologies, 2004, pp. 1-8.
[9] B. F. Zopon Al-Bayaty and S. Joshi, “Empirical implementation naive
bayes classifier for wsd using wordnet,” International Journal of
Computer Engineering & Technology, vol. 5, no. 8, pp. 25-31, August
2014.
[10] B. F. Zopon Al-Bayaty and S. Joshi, “Empirical implementation
decision tree classifier to WSD problem,” International Journal of
Advanced Technology in Engineering and Science, vol. 2, no. 1, pp.
597-601, 2014.
[11] H. L. Wee, “Word sense disambiguation using decision trees,”
Department of Computer Science, National University of Singapore,
2010.
[12] E-quran. [Online]. Available:
http://www.e-quran.com/language/english.
[13] Senseval. [Online]. Available: http://www.senseval.org/senseval3.
[14] D. Jurafsky and J. H. Martin, “Naïve bayes classifier approach to Word
sense disambiguation,” Computational Lexical Semantics, Sections 1
to 2, University of Groningen, 2009.
[15] G. Paliouras, V. Karkaletsis, and C. D. Spyropoulos, “Learning rules
for large vocabulary word sense disambiguation,” Institute of
Informatics & Telecommunications, NCSR “Demokritos” Aghia
Paraskevi Attikis, Athens, 15310, Greece.
[16] M. Joshi, T. Pedersen, and R. Maclin, “A comparative study of support vector machines applied to the supervised word sense disambiguation problem in the medical domain,” Department of Computer Science, University of Minnesota, Duluth, MN 55812, USA.
[17] O. Y. Kwong, “Psycholinguistics, lexicography, and word sense disambiguation,” in Proc. 26th Pacific Asia Conference on Language, Information and Computation, 2012, pp. 408-417.
[18] G. A. Miller et al., “Introduction to WordNet: An on-line lexical
database,” International Journal of Lexicography, vol. 3, no. 4, pp.
235-244, 1990.
[19] R. Navigli, “Word sense disambiguation: A survey,” ACM Computing
Surveys, vol. 41, no. 2, 2009.
[20] A. Fujii, “Corpus – based word sense disambiguation,” PhD. Thesis,
Department of Computer Science, Tokyo Institute of Technology,
March, 1998.
Boshra F. Zopon Al-Bayaty received her B.E. degree in computer science from Al-Mustansiriya University, College of Education, in 2002. She received her M.Sc. degree in computer science from the Iraqi Commission for Computers and Informatics, Informatics Institute for Postgraduate Studies. She is pursuing her Ph.D. in computer science at Bharati Vidyapeeth Deemed University, Pune.
She is currently working in the Ministry of Higher Education & Scientific Research, Al-Mustansiriyah University, Baghdad, Iraq. Her research interest is focused on software engineering.
Shashank Joshi received his B.E. degree in electronics and telecommunication from Govt. College of Engineering, Pune in 1988, and the M.E. and Ph.D. degrees in computer engineering from Bharati Vidyapeeth Deemed University, Pune. He is currently working as a professor in the Computer Engineering Department, Bharati Vidyapeeth Deemed University College of Engineering, Pune. His research interests include software engineering; presently he is engaged in SDLC and secure software development methodologies. He is an innovative teacher devoted to education and learning for the last 23 years.
TABLE IV: THE FINAL RESULTS OF NAÏVE BAYES AND DECISION TREE
CLASSIFIERS
Approaches Accuracy (%)
Naïve Bayes 62.86
Decision Tree 45.14