Product reviews are a valuable resource for information seeking and decision making. Products such as smartphones are discussed in terms of their aspects, e.g. battery life and screen quality. Knowing what users say about these aspects is useful because it guides other users in their buying process. In this paper, we automatically extract user statements about aspects for a given product. Our extraction method is based on dependency parse information from individual reviews. The parse information is used to learn patterns, which are then applied to determine the user statements for a given aspect. Our results show that our methods are able to extract potentially useful statements for given aspects.
The document summarizes research on aspect-based sentiment analysis. It discusses four main tasks in aspect-based sentiment analysis: aspect term extraction, aspect term polarity identification, aspect category detection, and aspect category polarity identification. It then reviews several approaches researchers have used for each task, including supervised methods like conditional random fields and support vector machines, as well as unsupervised methods. The document concludes by comparing results from different studies on restaurant and laptop review datasets.
With the rapid growth of internet and web usage, it has become essential to have a powerful tool capable of analyzing and ranking the reviews and opinions available on the web. In this paper we propose a new approach that uses a sentiment analysis procedure based on ontological adjustment and arrangement. The study also aims to understand POS tag order to obtain a detailed observation of any review or opinion; this helps in identifying the positive and negative sentiments present and in suggesting the proper inclination of a sentence. For this we used reviews of Nokia available on the internet and the Stanford parser for POS tagging.
Co-Extracting Opinions from Online Reviews - Editor IJCATR
Extraction of opinion targets and opinion words from online reviews is an important and challenging task in opinion mining. Opinion mining is the use of natural language processing, text analysis and computational processing to identify and recover subjective information in source materials. This paper proposes a supervised word alignment model that identifies opinion relations. The paper also focuses on topical relations, in which relevant information or features are extracted only from particular online reviews. It relies on a feature extraction algorithm to identify potential features. Finally, the items are ranked based on the frequency of positive and negative reviews. Compared to previous methods, our model captures opinion relations and extracts features more precisely. One of its main advantages is that it obtains better precision because of the supervised alignment model. In addition, an opinion relation graph is used to represent the relationship between opinion targets and opinion words.
IRJET- Sentimental Analysis of Product Reviews for E-Commerce Websites - IRJET Journal
This document summarizes a research paper that proposes using sentiment analysis of product reviews on e-commerce websites to help consumers decide where to purchase a product. The researchers describe collecting reviews from multiple websites, preprocessing the text, and using clustering and classification algorithms such as mean shift and support vector machines to label reviews as positive, negative or neutral. The system would then compare the results across websites and recommend the one with the most positive reviews, reducing the time users spend researching. Future work could include detecting fake reviews and identifying reasons for negative reviews on particular sites.
This document proposes a model to estimate overall sentiment score by applying rules of inference from discrete mathematics. It discusses sentiment analysis and related work using techniques like supervised/unsupervised learning. The problem is identifying sentiment components and restricting patterns for feature identification. Most approaches focus on nouns/adjectives but not verbs/adverbs. The model preprocesses product review datasets using NLTK for stemming, parsing and tokenizing. It builds a lexicon dictionary of positive and negative words. The Lexical Pattern Sentiment Analysis algorithm uses both lexicon and pattern mining - it selects sentence patterns, checks for positive/negative words in the lexicon, and calculates an overall sentiment score.
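The lexicon-and-pattern scoring described in the summary above can be sketched roughly as follows. This is a minimal illustration, not the paper's algorithm: the tiny lexicon, the negation rule used as the only "pattern", and the function names are all assumptions made for the example, and plain regular expressions stand in for the NLTK preprocessing the summary mentions.

```python
import re

# Hypothetical toy lexicon; a real system would load a much larger one.
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}
NEGATORS = {"not", "never", "no"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentence_score(sentence):
    """Sum +1/-1 per lexicon word, flipping the sign right after a negator."""
    score = 0
    tokens = tokenize(sentence)
    for i, tok in enumerate(tokens):
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity  # assumed pattern: "not good" counts as negative
        score += polarity
    return score


def overall_score(review):
    """Add up the sentence scores of a whole review."""
    sentences = [s for s in re.split(r"[.!?]+", review) if s.strip()]
    return sum(sentence_score(s) for s in sentences)
```

A review with one positive and one negated-positive sentence, such as "The screen is great. The battery is not good.", nets out to zero under this scheme, which shows why a single overall score can hide mixed opinions.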
This document discusses various techniques for sentiment analysis of application reviews, including both statistical and natural language processing approaches. It describes how sentiment analysis can be used to analyze textual reviews and classify them as positive or negative. Several key techniques are discussed, such as using machine learning classifiers like Naive Bayes, extracting n-grams and sentiment-oriented words, and developing rule-based models using techniques like identifying parts of speech. The document also discusses using these techniques to perform sentiment analysis at both the document and aspect levels.
Methods for Sentiment Analysis: A Literature Study - vivatechijri
Sentiment analysis is a trending topic, as everyone has an opinion on everything. The systematic study of these opinions can yield information that may prove valuable for many companies and industries in the future. A huge number of users are online and share their opinions and comments regularly; this information can be mined and used efficiently. Companies can review their own products using sentiment analysis and make the necessary changes. The data is huge, so collecting and analyzing it to produce the required result demands efficient processing.
In this paper, we discuss the various methods used for sentiment analysis. It also covers techniques such as the lexicon-based approach, SVM [10], convolutional neural networks, the morphological sentence pattern model [1] and the IML algorithm. The paper reports studies on various data sets such as Twitter API, Weibo, movie reviews, IMDb, a Chinese micro-blog database [9] and more, and presents the accuracy results obtained by all the systems.
One fundamental problem in sentiment analysis is the categorization of sentiment polarity. Given a piece of written text, the problem is to assign the text to one specific sentiment polarity: positive or negative (or neutral). Based on the scope of the text, sentiment polarity categorization is performed at three levels, namely the document level, the sentence level, and the entity and aspect level. Consider the review “I like multimedia features but the battery life sucks.” This sentence expresses a mixed emotion: the emotion regarding multimedia is positive, whereas that regarding battery life is negative. Hence, it is necessary to extract only those opinions relevant to a particular feature (such as battery life or multimedia) and classify them, instead of taking the complete sentence and its overall sentiment. In this paper, we present a novel approach to identify pattern-specific expressions of opinion in text.
Summarization and opinion detection of product reviews (1) - Lokesh Mittal
This document describes a project to generate summaries of product reviews. It scrapes reviews from websites, extracts features and identifies opinions as positive or negative. It uses dependency parsing to extract features and SentiWordNet to determine opinion orientation. The system generates a summary with the most common features and percentages of positive and negative opinions for each feature. Evaluation compares the extracted features and opinions to manual analysis. Future work includes improving pronoun resolution, opinion strength and other linguistic opinions.
Opinion mining of movie reviews at document level - ijitjournal
The world is changing rapidly, and with current technologies the Internet has become an essential need for everyone. The web is used in every field. Most people use the web for common purposes such as online shopping and chatting. During online shopping, users post a large number of reviews and opinions that reflect whether a product is good or bad. These reviews need to be explored, analysed and organized for better decision making. Opinion mining is a natural language processing task that deals with finding the orientation of opinion in a piece of text with respect to a topic. In this paper a document-based opinion mining system is proposed that classifies documents as positive, negative or neutral. Negation is also handled in the proposed system. Experimental results using movie reviews show the effectiveness of the system.
A SURVEY PAPER ON EXTRACTION OF OPINION WORD AND OPINION TARGET FROM ONLINE R... - ijiert bestjournal
Opinion mining here means mining opinion targets and opinion words from online reviews. To find the opinion relations among them, a partially supervised word alignment model is used. To find the confidence of each candidate, a graph-based co-ranking algorithm is used. Candidates whose confidence is higher than a threshold value are then extracted as opinion words or opinion targets. Compared to the previous syntax-based approach, this method can give correct results by eliminating parsing errors and can work on reviews in informal language. Compared to the nearest-neighbor method, it can give more precise results and can find relations within a long span. To decrease error propagation, the graph-based co-ranking algorithm is used to extract opinion targets and opinion words collectively. To decrease the probability of error generation, high-degree vertices are penalized and the effect of the random walk is reduced.
Knowing about users’ feedback can be a great aid in understanding the user as well as improving the organization. Here, student data is taken as an example for study purposes. Analyzing student feedback will help to address student-related problems and make teaching more student oriented. Prashali S. Shinde | Asmita R. Kanase | Rutuja S. Pawar | Yamini U. Waingankar, "Sentiment Analysis of Feedback Data", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Special Issue | Fostering Innovation, Integration and Inclusion Through Interdisciplinary Practices in Management, March 2019, URL: https://www.ijtsrd.com/papers/ijtsrd23090.pdf
Paper URL: https://www.ijtsrd.com/other-scientific-research-area/other/23090/sentiment-analysis-of-feedback-data/prashali--s-shinde
INFERENCE BASED INTERPRETATION OF KEYWORD QUERIES FOR OWL ONTOLOGY - IJwest
This paper presents a model for interpreting keyword queries over OWL ontologies that considers OWL axioms and restrictions to provide more precise answers to user queries. The model maps keywords from user queries to ontological elements and identifies phrases to select an appropriate SPARQL query template. The template is populated using inferred results from applying OWL restrictions to generate a formal SPARQL query for execution over the ontology. By addressing OWL features, the model aims to leverage the full capabilities of OWL knowledge bases to better understand users' information needs.
2005-Model-guided information discovery for intelligence analysis-p269-alonso - Hua Li, PhD
This document describes an approach to modeling intelligence analysts to guide information discovery. The Analyst Modeling Environment (AME) dynamically models an analyst's interests using a concept map that represents concepts, relationships between concepts, and associated interest levels. The model is adapted over time based on the analyst's interactions and feedback. An evaluation compared using AME to guide searches via the model to searches using Google alone, finding that the AME approach improved precision and recall.
A Survey on Evaluating Sentiments by Using Artificial Neural Network - IRJET Journal
This document discusses sentiment analysis using artificial neural networks. It begins with an abstract that introduces sentiment analysis and machine learning approaches used, including Naive Bayes, maximum entropy, and support vector machines. It then provides more detail on a survey of machine learning techniques for sentiment analysis, focusing on neural networks. The document proposes using a combination of neural networks and fuzzy logic to improve sentiment classification accuracy by better handling correlations between variables.
Using Hybrid Approach Analyzing Sentence Pattern by POS Sequence over Twitter - IRJET Journal
This document presents a study that uses part-of-speech (POS) sequence analysis to determine sentence patterns in tweets for sentiment analysis purposes. The study extracts 2-tag and 3-tag POS sequences from tweets and uses information gain to select the top sequences. Supervised classification with support vector machines is then performed using the POS sequences as features. The results show distinguishable sentence pattern groups for positive and negative tweets, and incorporating POS sequences can improve sentiment analysis accuracy compared to using lexicons alone.
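The two steps named in the summary above, extracting 2-tag and 3-tag POS sequences and ranking them by information gain, can be sketched as follows. This is an illustrative reading rather than the study's code: the input is assumed to be tweets already converted to lists of POS tags, the tag names in the usage note are Penn Treebank style by assumption, and the function names are hypothetical.

```python
import math
from collections import Counter


def tag_ngrams(tags, n):
    """Return the contiguous n-tag sequences of one tagged tweet."""
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, seq):
    """Entropy reduction from splitting the tweets on presence of `seq`."""
    n = len(seq)
    present = [y for tags, y in zip(docs, labels) if seq in tag_ngrams(tags, n)]
    absent = [y for tags, y in zip(docs, labels) if seq not in tag_ngrams(tags, n)]
    total = len(labels)
    remainder = (len(present) / total) * entropy(present) \
        + (len(absent) / total) * entropy(absent)
    return entropy(labels) - remainder
```

For instance, with four tagged tweets where `("PRP", "VBP")` occurs only in the positive ones, `information_gain` returns the full label entropy, so that sequence would rank at the top. The selected sequences could then serve as binary features for a classifier such as an SVM.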
This document proposes a system to detect fake product reviews on e-commerce sites. The system uses sentiment analysis, content similarity analysis, and review deviation analysis to identify fake reviews. It extracts product reviews from websites, preprocesses the data, and uses three techniques to detect fake reviews. The fake reviews are then used to train a classifier to label new reviews as fake or genuine. The system was able to detect 111 fake reviews out of 300 with the classifier identifying an additional 18 fake reviews. The techniques aim to make online shopping reviews more trustworthy.
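The summary does not say how content similarity is measured. As one possible reading, the sketch below flags pairs of reviews whose word sets overlap heavily, using Jaccard similarity. The choice of Jaccard, the 0.8 threshold and the function names are all assumptions made for illustration.

```python
import re
from itertools import combinations


def word_set(text):
    """Lowercased set of word tokens in a review."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of the word sets of two texts (0.0 if both are empty)."""
    sa, sb = word_set(a), word_set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def near_duplicates(reviews, threshold=0.8):
    """Return index pairs whose Jaccard similarity meets the threshold."""
    return [(i, j)
            for (i, a), (j, b) in combinations(enumerate(reviews), 2)
            if jaccard(a, b) >= threshold]
```

Pairs returned by `near_duplicates` would only be candidates; the system in the summary combines this kind of signal with sentiment and deviation analysis before labelling a review as fake.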
Feature Based Semantic Polarity Analysis Through Ontology - IOSR Journals
This document summarizes a research paper that proposes an opinion mining methodology using ontologies and natural language processing techniques to perform feature-based sentiment analysis of customer reviews. It begins by collecting customer reviews from websites. The reviews are preprocessed by removing URLs, usernames, etc. and performing part-of-speech tagging to extract product features. An ontology is constructed to organize the features and their relationships. Term frequencies are calculated to determine feature importance. Sentiment scores from -5 to 5 are assigned to each feature using a sentiment analysis tool and N-gram analysis. The methodology is evaluated using precision, recall, and F-measure. The feature-level sentiment analysis provides more detailed and helpful information for customers and developers compared to document-level
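The evaluation measures named in the summary above (precision, recall and F-measure) can be computed for a feature extraction task as in this short sketch. The function name and the example feature sets are hypothetical, and the F-measure shown is the balanced F1 variant by assumption.

```python
def precision_recall_f1(predicted, gold):
    """Compare extracted features with a hand-labelled gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: four extracted features, three gold features.
extracted = {"battery", "screen", "price", "box"}
gold = {"battery", "screen", "camera"}
print(precision_recall_f1(extracted, gold))
```

Here two of the four extracted features are correct, so precision is 0.5, and two of the three gold features are found, so recall is 2/3.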
Unsupervised Main Entity Extraction from News Articles using Latent Variables - Jinho Choi
This document presents a methodology for semi-unsupervised main entity extraction from news articles using latent variables. It trains a semi-supervised model using only semantic and lexical information from raw text to automatically extract main entities from articles. The extracted entities are evaluated based on word sequence matches between the entities and news article titles, with the evaluation metric for this task needing improvement.
This document summarizes a research paper that proposes a method for performing sentiment analysis on product reviews to identify promising product features. It involves scraping short reviews from websites, preprocessing the text through cleaning, tokenization and part-of-speech tagging. Next, it uses pattern mining and a custom lexicon dictionary to determine the overall sentiment score and sentiment scores for specific product features. The goal is to analyze which features consumers view most positively to help businesses understand customer preferences.
IRJET- Sentimental Analysis for Students’ Feedback using Machine Learning App... - IRJET Journal
This document discusses using machine learning approaches to perform sentiment analysis on students' feedback. Specifically, it proposes using a random forest classifier to analyze descriptive feedback collected through an online student portal and classify it as having positive, negative, or neutral sentiment. The proposed system would collect real-time feedback, preprocess it by removing stop words and tagging parts of speech, extract sentiment-related features, and use the trained random forest model to classify unseen feedback with 90% accuracy. The goal is to more accurately analyze both objective and descriptive feedback to evaluate teacher performance.
This document presents a statistical weighted approach for performing sentiment analysis on movie reviews to identify overall and feature-level sentiments. It involves identifying adjectives from reviews and classifying them as positive or negative based on their score in a database. Feature extraction is also performed to analyze sentiments for specific movie aspects. Individual review sentiments are identified and then aggregated to determine overall sentiment scores for movies. The approach is tested on real movie review datasets and is able to accurately derive positive or negative sentiment scores for different movies based on review analysis.
Opinion mining on newspaper headlines using SVM and NLP - IJECEIAES
Opinion mining, also known as sentiment analysis, is a technique that uses natural language processing (NLP) to classify the outcome from text. Various NLP tools are available for processing text data. Multiple studies have been conducted on opinion mining for online blogs, Twitter, Facebook, etc. This paper proposes a new opinion mining technique using a Support Vector Machine (SVM) and NLP tools on newspaper headlines. Relative words are generated using Stanford CoreNLP and passed to the SVM using a count vectorizer. On comparing three models using confusion matrices, the results indicate that Tf-idf with a linear SVM provides better accuracy for the smaller dataset, while for the larger dataset the SGD and linear SVM model outperforms the other models.
IRJET- A Review on: Sentiment Polarity Analysis on Twitter Data from Diff... - IRJET Journal
This document summarizes research on sentiment polarity analysis of Twitter data from different events. It discusses how Twitter data can be used for opinion mining and sentiment analysis. Several papers that used techniques like naive Bayes classifier, support vector machines, and dual sentiment analysis on Twitter data are summarized. The document also provides an overview of the key steps involved in a Twitter sentiment analysis system, including data collection, preprocessing, feature extraction, training a classification model, and evaluating accuracy. The goal of analyzing sentiments on Twitter is to understand public opinions on different topics and events.
Mining of product reviews at aspect level - ijfcstjournal
Today’s world is a world of the Internet: almost all work can be done with its help, from a simple mobile phone recharge to the biggest business deals. People spend most of their time surfing the web, which has become a new source of entertainment, education, communication, shopping and more. Users not only use these websites but also give feedback and suggestions that are useful for other users. In this way a large number of user reviews accumulate on the web, and they need to be explored, analysed and organized for better decision making. Opinion mining, or sentiment analysis, is a natural language processing and information extraction task that identifies the user’s views or opinions, expressed as positive, negative or neutral comments and quotes, underlying the text. Aspect-based opinion mining is one level of opinion mining; it determines the aspects of the given reviews and classifies the review for each feature. In this paper an aspect-based opinion mining system is proposed to classify reviews as positive, negative or neutral for each feature. Negation is also handled in the proposed system. Experimental results using product reviews show the effectiveness of the system.
Sentiment Analysis Using Hybrid Approach: A Survey - IJERA Editor
Sentiment analysis is the process of identifying people’s attitudes and emotional states from language. This objective is realized by identifying a set of potential features in a review and extracting opinion expressions about those features by exploiting their associations. Opinion mining, also known as sentiment analysis, plays an important role in this process. It is the study of emotions, i.e. sentiments and expressions, that are stated in natural language. Natural language techniques are applied to extract emotions from unstructured data. Several techniques can be used to analyse this type of data. Here, we broadly categorize these techniques as “supervised learning”, “unsupervised learning” and “hybrid techniques”. The objective of this paper is to provide an overview of sentiment analysis, its challenges, and a comparative analysis of its techniques in the field of natural language processing.
Sentiment Analysis: A comparative study of Deep Learning and Machine Learning - IRJET Journal
This document compares sentiment analysis techniques using deep learning and machine learning. It summarizes previous work using various machine learning algorithms and deep learning methods for sentiment analysis. The document then outlines the approach taken in this study, which is to determine the best sentiment analysis results using either machine learning or deep learning techniques. It describes preprocessing the Rotten Tomatoes movie review dataset and creating text matrices before selecting models for classification. The goal is to get a generalized understanding of how sentiment analysis can be performed and which practices yield optimal results.
ASPECT-BASED OPINION EXTRACTION FROM CUSTOMER REVIEWS - csandit
Text is the main method of communicating information in the digital age. Messages, blogs, news articles, reviews, and other opinionated information abound on the Internet. People commonly purchase products online and post their opinions about purchased items. This feedback is displayed publicly to assist others with their purchasing decisions, creating the need for a mechanism with which to extract and summarize useful information for enhancing the decision-making process. Our contribution is to improve the accuracy of extraction by combining different techniques from three major areas, namely data mining, natural language processing and ontologies. The proposed framework sequentially mines products’ aspects and users’ opinions, groups representative aspects by similarity, and generates an output summary. This paper focuses on the task of extracting product aspects and users’ opinions by extracting all possible aspects and opinions from reviews using natural language, ontology, and frequent “tag” sets. The proposed framework, when compared with an existing baseline model, yielded promising results.
One fundamental problem in sentiment analysis is categorization of sentiment polarity. Given a piece of written text, the problem is to categorize the text into one specific sentiment polarity, positive or negative (or neutral). Based on the scope of the text, there are three distinctions of sentiment polarity categorization, namely the document level, the sentence level, and the entity and aspect level. Consider a review “I like multimedia features but the battery life sucks.†This sentence has a mixed emotion. The emotion regarding multimedia is positive whereas that regarding battery life is negative. Hence, it is required to extract only those opinions relevant to a particular feature (like battery life or multimedia) and classify them, instead of taking the complete sentence and the overall sentiment. In this paper, we present a novel approach to identify pattern specific expressions of opinion in text.
Summarization and opinion detection of product reviews (1)Lokesh Mittal
This document describes a project to generate summaries of product reviews. It scrapes reviews from websites, extracts features and identifies opinions as positive or negative. It uses dependency parsing to extract features and SentiWordNet to determine opinion orientation. The system generates a summary with the most common features and percentages of positive and negative opinions for each feature. Evaluation compares the extracted features and opinions to manual analysis. Future work includes improving pronoun resolution, opinion strength and other linguistic opinions.
Opinion mining of movie reviews at document levelijitjournal
The whole world is changed rapidly and using the current technologies Internet becomes an essential
need for everyone. Web is used in every field. Most of the people use web for a common purpose like
online shopping, chatting etc. During an online shopping large number of reviews/opinions are given by
the users that reflect whether the product is good or bad. These reviews need to be explored, analyse and
organized for better decision making. Opinion Mining is a natural language processing task that deals
with finding orientation of opinion in a piece of text with respect to a topic. In this paper a document
based opinion mining system is proposed that classify the documents as positive, negative and neutral.
Negation is also handled in the proposed system. Experimental results using reviews of movies show the
effectiveness of the system.
A SURVEY PAPER ON EXTRACTION OF OPINION WORD AND OPINION TARGET FROM ONLINE R... ijiert bestjournal
Opinion mining here means mining opinion targets and opinion words from online reviews. A partially supervised word alignment model is used to find the opinion relations among them, and a graph-based co-ranking algorithm is used to estimate the confidence of each candidate. Candidates whose confidence is higher than a threshold value are then extracted as opinion words or opinion targets. Compared with the previous syntax-based approach, this method can give correct results by eliminating parsing errors and can work on reviews written in informal language. Compared with the nearest-neighbor method, it can give more precise results and can find relations within a long span. The graph-based co-ranking algorithm extracts opinion targets and opinion words collectively, which decreases error propagation. To decrease the probability of error generation, high-degree vertices are penalized and the effect of the random walk is decreased.
Knowing about users' feedback can be of great help in understanding the users as well as improving the organization. Here, student data is taken as an example for study purposes. Analyzing student feedback will help to address student-related problems and to make teaching more student oriented. Prashali S. Shinde | Asmita R. Kanase | Rutuja S. Pawar | Yamini U. Waingankar "Sentiment Analysis of Feedback Data" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Special Issue | Fostering Innovation, Integration and Inclusion Through Interdisciplinary Practices in Management, March 2019, URL: https://www.ijtsrd.com/papers/ijtsrd23090.pdf
Paper URL: https://www.ijtsrd.com/other-scientific-research-area/other/23090/sentiment-analysis-of-feedback-data/prashali--s-shinde
INFERENCE BASED INTERPRETATION OF KEYWORD QUERIES FOR OWL ONTOLOGY IJwest
This paper presents a model for interpreting keyword queries over OWL ontologies that considers OWL axioms and restrictions to provide more precise answers to user queries. The model maps keywords from user queries to ontological elements and identifies phrases to select an appropriate SPARQL query template. The template is populated using inferred results from applying OWL restrictions to generate a formal SPARQL query for execution over the ontology. By addressing OWL features, the model aims to leverage the full capabilities of OWL knowledge bases to better understand users' information needs.
2005-Model-guided information discovery for intelligence analysis-p269-alonso Hua Li, PhD
This document describes an approach to modeling intelligence analysts to guide information discovery. The Analyst Modeling Environment (AME) dynamically models an analyst's interests using a concept map that represents concepts, relationships between concepts, and associated interest levels. The model is adapted over time based on the analyst's interactions and feedback. An evaluation compared using AME to guide searches via the model to searches using Google alone, finding that the AME approach improved precision and recall.
A Survey on Evaluating Sentiments by Using Artificial Neural Network IRJET Journal
This document discusses sentiment analysis using artificial neural networks. It begins with an abstract that introduces sentiment analysis and machine learning approaches used, including Naive Bayes, maximum entropy, and support vector machines. It then provides more detail on a survey of machine learning techniques for sentiment analysis, focusing on neural networks. The document proposes using a combination of neural networks and fuzzy logic to improve sentiment classification accuracy by better handling correlations between variables.
Using Hybrid Approach Analyzing Sentence Pattern by POS Sequence over Twitter IRJET Journal
This document presents a study that uses part-of-speech (POS) sequence analysis to determine sentence patterns in tweets for sentiment analysis purposes. The study extracts 2-tag and 3-tag POS sequences from tweets and uses information gain to select the top sequences. Supervised classification with support vector machines is then performed using the POS sequences as features. The results show distinguishable sentence pattern groups for positive and negative tweets, and incorporating POS sequences can improve sentiment analysis accuracy compared to using lexicons alone.
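The 2-tag and 3-tag POS sequences mentioned above can be extracted with a simple sliding window. The sketch assumes the tweets have already been POS-tagged, so each tweet is a list of tag strings. The function names are hypothetical, and the information-gain selection and SVM steps from the study are not shown.

```python
from collections import Counter


def pos_ngrams(tags, n):
    """Return all contiguous n-tag sequences from one tagged tweet."""
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]


def count_sequences(tagged_tweets, sizes=(2, 3)):
    """Count 2-tag and 3-tag POS sequences over a collection of tweets."""
    counts = Counter()
    for tags in tagged_tweets:
        for n in sizes:
            counts.update(pos_ngrams(tags, n))
    return counts


# Example with Penn Treebank style tags (the tag set is an assumption).
tweets = [["PRP", "VBP", "DT", "NN"], ["DT", "NN", "VBZ", "JJ"]]
sequence_counts = count_sequences(tweets)
```

The resulting counts could then serve as candidate features, with a selection step such as information gain keeping only the most discriminative sequences.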
This document proposes a system to detect fake product reviews on e-commerce sites. The system uses sentiment analysis, content similarity analysis, and review deviation analysis to identify fake reviews. It extracts product reviews from websites, preprocesses the data, and uses three techniques to detect fake reviews. The fake reviews are then used to train a classifier to label new reviews as fake or genuine. The system was able to detect 111 fake reviews out of 300 with the classifier identifying an additional 18 fake reviews. The techniques aim to make online shopping reviews more trustworthy.
Feature Based Semantic Polarity Analysis Through Ontology IOSR Journals
This document summarizes a research paper that proposes an opinion mining methodology using ontologies and natural language processing techniques to perform feature-based sentiment analysis of customer reviews. It begins by collecting customer reviews from websites. The reviews are preprocessed by removing URLs, usernames, etc. and performing part-of-speech tagging to extract product features. An ontology is constructed to organize the features and their relationships. Term frequencies are calculated to determine feature importance. Sentiment scores from -5 to 5 are assigned to each feature using a sentiment analysis tool and N-gram analysis. The methodology is evaluated using precision, recall, and F-measure. The feature-level sentiment analysis provides more detailed and helpful information for customers and developers compared to document-level
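Precision, recall and F-measure for feature extraction are commonly computed by comparing the extracted features with a manually labelled set. The sketch below uses the standard set-based definitions. The function name and the example feature names are hypothetical, and the paper's exact evaluation protocol is not given in the summary.

```python
def precision_recall_f1(extracted, gold):
    """Compare extracted features against a gold-standard set.

    precision = |extracted ∩ gold| / |extracted|
    recall    = |extracted ∩ gold| / |gold|
    F-measure = harmonic mean of precision and recall
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Two of the four extracted features are correct; one gold feature is missed.
p, r, f = precision_recall_f1(
    ["battery", "screen", "price", "box"],
    ["battery", "screen", "camera"],
)
```

Here precision is 2/4, recall is 2/3, and the F-measure is 4/7.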
Unsupervised Main Entity Extraction from News Articles using Latent Variables Jinho Choi
This document presents a methodology for semi-unsupervised main entity extraction from news articles using latent variables. It trains a semi-supervised model using only semantic and lexical information from raw text to automatically extract main entities from articles. The extracted entities are evaluated based on word sequence matches between the entities and news article titles, with the evaluation metric for this task needing improvement.
This document summarizes a research paper that proposes a method for performing sentiment analysis on product reviews to identify promising product features. It involves scraping short reviews from websites, preprocessing the text through cleaning, tokenization and part-of-speech tagging. Next, it uses pattern mining and a custom lexicon dictionary to determine the overall sentiment score and sentiment scores for specific product features. The goal is to analyze which features consumers view most positively to help businesses understand customer preferences.
IRJET- Sentimental Analysis for Students’ Feedback using Machine Learning App... IRJET Journal
This document discusses using machine learning approaches to perform sentiment analysis on students' feedback. Specifically, it proposes using a random forest classifier to analyze descriptive feedback collected through an online student portal and classify it as having positive, negative, or neutral sentiment. The proposed system would collect real-time feedback, preprocess it by removing stop words and tagging parts of speech, extract sentiment-related features, and use the trained random forest model to classify unseen feedback with 90% accuracy. The goal is to more accurately analyze both objective and descriptive feedback to evaluate teacher performance.
This document presents a statistical weighted approach for performing sentiment analysis on movie reviews to identify overall and feature-level sentiments. It involves identifying adjectives from reviews and classifying them as positive or negative based on their score in a database. Feature extraction is also performed to analyze sentiments for specific movie aspects. Individual review sentiments are identified and then aggregated to determine overall sentiment scores for movies. The approach is tested on real movie review datasets and able to accurately derive positive or negative sentiment scores for different movies based on review analysis.
Opinion mining on newspaper headlines using SVM and NLP IJECEIAES
Opinion mining, also known as sentiment analysis, is a technique that uses natural language processing (NLP) to classify the outcome from text. Various NLP tools are available for processing text data. Much research has been done on opinion mining for online blogs, Twitter, Facebook, etc. This paper proposes a new opinion mining technique using Support Vector Machine (SVM) and NLP tools on newspaper headlines. Relative words are generated using Stanford CoreNLP and passed to the SVM using a count vectorizer. On comparing three models using confusion matrices, the results indicate that Tf-idf with linear SVM provides better accuracy for the smaller dataset, while for the larger dataset the SGD and linear SVM models outperform the other models.
IRJET- A Review on: Sentiment Polarity Analysis on Twitter Data from Diff... IRJET Journal
This document summarizes research on sentiment polarity analysis of Twitter data from different events. It discusses how Twitter data can be used for opinion mining and sentiment analysis. Several papers that used techniques like naive Bayes classifier, support vector machines, and dual sentiment analysis on Twitter data are summarized. The document also provides an overview of the key steps involved in a Twitter sentiment analysis system, including data collection, preprocessing, feature extraction, training a classification model, and evaluating accuracy. The goal of analyzing sentiments on Twitter is to understand public opinions on different topics and events.
Mining of product reviews at aspect level ijfcstjournal
Today's world is a world of the Internet; almost all work can be done with its help, from a simple mobile phone recharge to the biggest business deals. People spend most of their time surfing the Web; it has become a new source of entertainment, education, communication, shopping, etc. Users not only use these websites but also give feedback and suggestions that are useful for other users. In this way, a large number of user reviews is collected on the Web that needs to be explored, analysed and organized for better decision making. Opinion mining, or sentiment analysis, is a natural language processing and information extraction task that identifies the user's views or opinions, expressed in the form of positive, negative or neutral comments and quotes underlying the text. Aspect-based opinion mining is one of the levels of opinion mining; it determines the aspects of the given reviews and classifies the review for each feature. In this paper, an aspect-based opinion mining system is proposed to classify reviews as positive, negative or neutral for each feature. Negation is also handled in the proposed system. Experimental results using product reviews show the effectiveness of the system.
Sentiment Analysis Using Hybrid Approach: A Survey IJERA Editor
Sentiment analysis is the process of identifying people's attitudes and emotional states from language. The main objective is realized by identifying a set of potential features in the review and extracting opinion expressions about those features by exploiting their associations. Opinion mining, also known as sentiment analysis, plays an important role in this process. It is the study of emotions, i.e. sentiments and expressions that are stated in natural language. Natural language techniques are applied to extract emotions from unstructured data. Several techniques can be used to analyse this type of data. Here, we categorize these techniques broadly as "supervised learning", "unsupervised learning" and "hybrid techniques". The objective of this paper is to provide an overview of sentiment analysis, its challenges, and a comparative analysis of its techniques in the field of natural language processing.
Sentiment Analysis: A comparative study of Deep Learning and Machine Learning IRJET Journal
This document compares sentiment analysis techniques using deep learning and machine learning. It summarizes previous work using various machine learning algorithms and deep learning methods for sentiment analysis. The document then outlines the approach taken in this study, which is to determine the best sentiment analysis results using either machine learning or deep learning techniques. It describes preprocessing the Rotten Tomatoes movie review dataset and creating text matrices before selecting models for classification. The goal is to get a generalized understanding of how sentiment analysis can be performed and which practices yield optimal results.
ASPECT-BASED OPINION EXTRACTION FROM CUSTOMER REVIEWS csandit
Text is the main method of communicating information in the digital age. Messages, blogs, news articles, reviews, and opinionated information abound on the Internet. People commonly purchase products online and post their opinions about purchased items. This feedback is displayed publicly to assist others with their purchasing decisions, creating the need for a mechanism with which to extract and summarize useful information for enhancing the decision-making process. Our contribution is to improve the accuracy of extraction by combining different techniques from three major areas, namely Data Mining, Natural Language Processing techniques and Ontologies. The proposed framework sequentially mines products' aspects and users' opinions, groups representative aspects by similarity, and generates an output summary. This paper focuses on the task of extracting product aspects and users' opinions by extracting all possible aspects and opinions from reviews using natural language, ontology, and frequent "tag" sets. The proposed framework, when compared with an existing baseline model, yielded promising results.
Empirical Model of Supervised Learning Approach for Opinion Mining IRJET Journal
This summarizes an empirical model for opinion mining using supervised learning with an integrated alignment model and naive Bayesian classification model. The proposed model aims to automatically identify user reviews of products as positive or negative and provide an aggregated product rating based on review sentiment analysis and rankings. An alignment model is used to match keywords between source and target reviews to determine sentiment polarity. If a match is not found, the review is sent to a naive Bayesian classification model for sentiment analysis and rating. A rank aggregation model then considers data parameters like user ID, time, and rank to generate a ranked list of products based on ratings and sentiment analysis while excluding short-duration sessions or redundant comments. The proposed hybrid model aims to provide more accurate results for product sentiment analysis
The document presents an approach for extracting aspects and associated sentiments from user feedback data using a rule-based approach. It involves extracting aspects from sentences, associating sentiment terms to the aspects using SentiWordNet, and classifying sentiments according to linguistic rules. The approach uses part-of-speech tagging and WordNet to identify aspects and group related ones. Sentiment scores are normalized to account for intensifiers, negations, and ambiguity. The approach was tested on 65,000 responses from a hospital survey to extract and classify aspects and sentiments at the sentence level.
IRJET- Opinion Targets and Opinion Words Extraction for Online Reviews wi... IRJET Journal
The document discusses a technique for extracting opinion targets and opinion words from online reviews using sentiment analysis. It proposes using a partially supervised word alignment model (PSWAM) to identify opinion relations between words and extract candidates as targets or words. A graph-based algorithm is then used to estimate candidate confidence, and the highest confidence candidates are extracted. The technique aims to more precisely capture opinion relations compared to previous methods. Experimental results on online product reviews showed the effectiveness of the proposed approach.
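The final extraction step, keeping only the highest-confidence candidates, can be illustrated with a simple cutoff. The sketch assumes confidence scores have already been produced by the graph-based algorithm. The threshold value of 0.5, the example scores, and the function name are hypothetical and not taken from the paper.

```python
def extract_candidates(confidence, threshold=0.5):
    """Keep candidates whose confidence exceeds the (assumed) threshold.

    `confidence` maps each candidate word to a score. The result is ordered
    from highest to lowest confidence, with ties broken alphabetically.
    """
    kept = [(word, score) for word, score in confidence.items()
            if score > threshold]
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [word for word, _ in kept]


# Made-up scores for illustration only.
targets = extract_candidates({"screen": 0.9, "battery": 0.7, "thing": 0.2})
```

With these made-up scores, `targets` is `["screen", "battery"]`. Raising the threshold makes the extraction stricter, at the likely cost of missing some valid candidates.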
Fake Product Review Monitoring & Removal and Sentiment Analysis of Genuine Re... Dr. Amarjeet Singh
An e-commerce website gets a bad reputation if it sells a product that has bad reviews; most of the time the user blames the e-commerce website rather than the manufacturer. On some review sites, favourable reviews are added by people from the product company itself in order to produce false positive product reviews. To eliminate this type of fake product review, we will create a system that finds the fake reviews and eliminates them using machine learning. We also remove the reviews that are flooded in by a marketing agency in order to boost the ratings of a particular product. Finally, sentiment analysis is done on the genuine reviews to classify them into positive and negative. We will use bag-of-words to label individual words according to their sentiment.
IRJET- Cross-Domain Sentiment Encoding through Stochastic Word Embedding IRJET Journal
This document discusses cross-domain sentiment encoding through stochastic word embedding. It proposes a novel method that takes advantage of stochastic embedding techniques to tackle cross-domain sentiment alignment in a simple way without complex model designs or additional learning tasks. The method encodes word polarity and occurrence information from reviews to learn representations across domains. It is benchmarked on sentiment classification tasks using two review corpora and compared to other classical and state-of-the-art methods.
Sentiment Analysis and Classification of Tweets using Data Mining IRJET Journal
This document summarizes research on using data mining techniques to perform sentiment analysis on tweets. The researchers collected tweets from Twitter and preprocessed the text to make it usable for building sentiment classifiers. They used three classifiers - K-Nearest Neighbor, Naive Bayes, and Decision Tree - and compared the results to determine which provided the best accuracy. Rapid Miner tool was used to preprocess the text, build the classifiers, and analyze the results. The goal was to determine people's sentiments expressed in their tweets and correctly classify them.
This document discusses sentiment analysis on unstructured product reviews. It begins with an introduction to sentiment analysis and opinion mining. The author then reviews related work on aspect-based sentiment analysis and feature extraction. The proposed work involves extracting features from unstructured reviews, determining sentiment polarity using SentiStrength, and classifying features using Naive Bayes. The experiment uses 575 reviews to identify prominent product aspects and determine sentiment scores. Naive Bayes classification is performed in Tanagra to obtain prior distributions of sentiment for each feature. Figures and tables are included to illustrate the process.
IRJET - Response Analysis of Educational Videos IRJET Journal
This document summarizes a research paper that analyzes student feedback on educational videos through sentiment analysis. It proposes a system to collect student comments, preprocess the data, identify sentiment and emotions, compute student satisfaction and dissatisfaction, and visualize the results. The system uses machine learning techniques like term frequency-inverse document frequency and random forest classification. It achieved 62.5% accuracy in classifying sentiment polarity in student comments. The analysis of student responses can help teachers better understand student interest and identify areas for improvement.
A Review on Sentimental Analysis of Application Reviews IJMER
With the rapid evolution of computer technology and smartphones, mobile applications have become a very important part of our lives. It is very difficult for customers to keep track of the reviews of different applications, so sentiment analysis is used. Sentiment analysis is an effective and efficient evaluation of customers' opinions in real time. Sentiment analysis of application reviews is performed with two approaches: statistical model based approaches and natural language processing (NLP) based approaches that create rules. Two schemes are used for analysing the textual comments. The first, aspect-level sentiment analysis, analyses the text and provides a label for each aspect; the scores on multiple aspects are then aggregated and the results for the reviews are shown in graphs. The second scheme is document-level analysis, which comprises adjectives, adverbs and verbs, and n-gram feature extraction. I have also used our SentiWordNet scheme to compute the document-level sentiment for each movie reviewed and compared the results with those obtained using the Alchemy API. The sentiment profile of a movie is also compared with the document-level sentiment result. The results obtained show that my scheme produces a more accurate and focused sentiment profile than the simple document-level sentiment analysis.
This document discusses various techniques for sentiment analysis of application reviews, including both statistical and natural language processing approaches. It describes how sentiment analysis can be used to analyze textual reviews and classify them as positive or negative. Several key techniques are discussed, such as using machine learning classifiers like Naive Bayes, extracting n-grams and sentiment-oriented words, and developing rule-based models using techniques like identifying parts of speech. The document also discusses using these techniques to perform sentiment analysis at both the document and aspect levels.
The document proposes a Requirement Opinions Mining Method (ROM) to mine user requirements from software review data. It first defines requirement opinions, functional requirement opinions, and non-functional requirement opinions. It then uses deep learning models to classify reviews into functional and non-functional categories. Functional reviews are further classified into three categories and sequence labeling is used to identify functional requirements. Non-functional reviews are clustered using K-means clustering with word vectors. Finally, specific requirements are extracted from the clusters using TF-IDF and syntactic analysis to realize requirement opinion mining from software review data. A case study is conducted on reviews from a Chinese mobile application platform.
This document discusses opinion mining and sentiment analysis for business intelligence purposes. It provides an overview of related work on extracting opinions from text to classify sentiments. The paper surveys techniques like lexicon-based approaches and machine learning algorithms for sentiment classification. It also discusses how opinion mining can help business analysts extract relevant information from large amounts of unstructured data on the web to make informed decisions. Future work may involve applying techniques like neural networks and improving information retrieval from XML data sources.
Web User Opinion Analysis for Product Features Extraction and Opinion Summari... dannyijwest
Selling products through the Web has become more popular because of online shopping. This enables merchants to sell their products through the Web and expect customers to express their opinions online about the products they have purchased. As a result, the number of customer reviews of a particular product varies from hundreds to thousands, and for some products it is even higher. In order to help the customer and the manufacturer/merchant, we propose a semantic-based approach that mines different product features and summarizes the opinions about each extracted feature, as expressed by web users in customer reviews, using typed dependency relations.
With the increasing number of online comments, it is hard for buyers to find useful information in a short time, so it makes sense to do research on automatic summarization, whose fundamental work focuses on mining product reviews. Previous studies mainly focused on explicit feature extraction and often ignored implicit features, which are not stated clearly but contain information necessary for analysing comments. How to mine features from web reviews quickly and accurately is therefore of great significance for summarization technology. In this paper, explicit features and "feature-opinion" pairs in the explicit sentences are extracted by a Conditional Random Field, and implicit product features are recognized by a bipartite graph model based on a random walk algorithm. The features and corresponding opinions are then incorporated into a structured text, and the abstract is generated based on the extraction results. The experimental results demonstrate that the proposed methods outperform the baselines.
This document summarizes research on sentiment analysis of Twitter data. It discusses how sentiment analysis can classify tweets as positive, negative, or neutral. It reviews different techniques for sentiment analysis, including machine learning approaches like Naive Bayes classifiers and lexicon-based approaches. The document also describes prior studies that have used sentiment analysis techniques to predict security attacks based on Twitter sentiment and explore improvements in classification accuracy. In general, the document outlines common methods for analyzing sentiment in social media data and highlights past applications of the analysis.
IRJET- A Survey on Graph based Approaches in Sentiment Analysis IRJET Journal
This document summarizes research on graph-based approaches for sentiment analysis. It discusses different graph-based techniques proposed in previous studies, including using graphs to model relationships between tweets containing the same hashtag, between n-grams in documents, and between users, tweets, and features on Twitter. It also categorizes related works based on the proposed method, approach used, dataset, and limitations. The document concludes that graph-based approaches can provide higher accuracy for sentiment classification than other methods by capturing semantic relationships.
Business recommendation based on collaborative filtering and feature engineer... IJECEIAES
Business decisions for any service or product depend on people's sentiments. We get these sentiments or ratings from websites such as Twitter and Kaggle. The mood of people towards any event, service or product is expressed in these sentiments or ratings. The text of a sentiment contains different linguistic features of the sentence. A sentiment sentence also contains other features that play a vital role in deciding the polarity of the sentiment. If feature selection is done properly, one can extract better sentiments for decision making. Directed preprocessing feeds filtered input to any machine learning approach. Feature-based collaborative filtering can be used for better sentiment analysis. Better use of parts of speech (POS), followed by guided preprocessing and evaluation, minimizes the error in sentiment polarity, and hence a better recommendation to the user for business analytics can be attained.
AN EXPERIMENTAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING ijscai
Feature selection or extraction is the most important task in Opinion Mining and Sentiment Analysis (OSMA) for calculating the polarity score. These scores are used to determine the positive, negative and neutral polarity of products, user reviews, user comments, etc. in social media, for the purpose of decision making and business intelligence for individuals or organizations. In this paper, we perform an experimental study of the different feature extraction or selection techniques available for the opinion mining task. The study is carried out in four stages. First, data is collected from readily available sources. Second, pre-processing techniques are applied automatically, using tools to extract the terms and their POS (parts of speech). Third, different feature selection or extraction techniques are applied to the content. Finally, an empirical study is carried out to analyse the sentiment polarity with different features.
Similar to TOWARDS MAKING SENSE OF ONLINE REVIEWS BASED ON STATEMENT EXTRACTION (20)
ANALYSIS OF LAND SURFACE DEFORMATION GRADIENT BY DINSAR cscpconf
The progressive development of Synthetic Aperture Radar (SAR) systems has diversified the exploitation of the images these systems generate in different geoscience applications. The detection and monitoring of surface deformations produced by various phenomena has benefited from this evolution and has been realized by interferometry (InSAR) and differential interferometry (DInSAR) techniques. Nevertheless, spatial and temporal decorrelation of the interferometric pairs used strongly limits the precision of the results of these techniques. In this context, we propose a methodological approach to surface deformation detection and analysis using differential interferograms, to show the limits of this technique according to noise quality and level. The detectability model is generated from the deformation signatures by simulating a linear fault merged into the image pairs of the ERS1/ERS2 sensors acquired in a region of southern Algeria.
4D AUTOMATIC LIP-READING FOR SPEAKER'S FACE IDENTIFCATION cscpconf
A novel trajectory-guided, concatenative approach for synthesizing high-quality video rendered from real image samples is proposed. Automatic lip reading searches the video data library for the real image sample sequence closest to the HMM-predicted trajectory. The object trajectory is modeled by projecting the face patterns into a KDA feature space. The approach to speaker's face identification synthesizes the identity surface of a subject's face from a small sample of patterns that sparsely cover the view sphere. A KDA algorithm is used to discriminate the lip-reading images, after which the fundamental lip feature vector is reduced to a low dimension using the 2D-DCT. The dimensionality of the mouth area set is then reduced with PCA to obtain the Eigenlips approach proposed in [33]. The subjective performance results of the cost function under the automatic lips reading modeled , which wasn’t illustrate the superior performance of the method.
MOVING FROM WATERFALL TO AGILE PROCESS IN SOFTWARE ENGINEERING CAPSTONE PROJE... cscpconf
Universities offer software engineering capstone courses to simulate a real-world working environment in which students can work in a team for a fixed period to deliver a quality product. The objective of the paper is to report on our experience in moving from a Waterfall process to an Agile process in conducting the software engineering capstone project. We present the capstone course designs for both the Waterfall-driven and Agile-driven methodologies, highlighting the structure, deliverables and assessment plans. To evaluate the improvement, we conducted a survey of two different sections taught by two different instructors to assess students' experience in moving from the traditional Waterfall model to an Agile-like process. Twenty-eight students filled in the survey. The survey consisted of eight multiple-choice questions and an open-ended question to collect feedback from students. The survey results show that students were able to attain hands-on experience that simulates a real-world working environment. The results also show that the Agile approach helped students to achieve a better overall design and avoid mistakes they had made in the initial design completed in the first phase of the capstone project. In addition, they were able to decide on their team capabilities and training needs, and thus learn the required technologies earlier, which is reflected in the final product quality.
PROMOTING STUDENT ENGAGEMENT USING SOCIAL MEDIA TECHNOLOGIES cscpconf
This document discusses using social media technologies to promote student engagement in a software project management course. It describes the course and objectives of enhancing communication. It discusses using Facebook for 4 years, then switching to WhatsApp based on student feedback, and finally introducing Slack to enable personalized team communication. Surveys found students engaged and satisfied with all three tools, though less familiar with Slack. The conclusion is that social media promotes engagement but familiarity with the tool also impacts satisfaction.
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGICcscpconf
In real world computing environment with using a computer to answer questions has been a human dream since the beginning of the digital era, Question-answering systems are referred to as intelligent systems, that can be used to provide responses for the questions being asked by the user based on certain facts or rules stored in the knowledge base it can generate answers of questions asked in natural , and the first main idea of fuzzy logic was to working on the problem of computer understanding of natural language, so this survey paper provides an overview on what Question-Answering is and its system architecture and the possible relationship and
different with fuzzy logic, as well as the previous related research with respect to approaches that were followed. At the end, the survey provides an analytical discussion of the proposed QA models, along or combined with fuzzy logic and their main contributions and limitations.
DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS cscpconf
Human beings generate different speech waveforms while speaking the same word at different times. Also, different human beings have different accents and generate significantly varying speech waveforms for the same word. There is a need to measure the distances between various words which facilitate preparation of pronunciation dictionaries. A new algorithm called Dynamic Phone Warping (DPW) is presented in this paper. It uses dynamic programming technique for global alignment and shortest distance measurements. The DPW algorithm can be used to enhance the pronunciation dictionaries of the well-known languages like English or to build pronunciation dictionaries to the less known sparse languages. The precision measurement experiments show 88.9% accuracy.
INTELLIGENT ELECTRONIC ASSESSMENT FOR SUBJECTIVE EXAMS cscpconf
In education, the use of electronic (E) examination systems is not a novel idea, as Eexamination systems have been used to conduct objective assessments for the last few years. This research deals with randomly designed E-examinations and proposes an E-assessment system that can be used for subjective questions. This system assesses answers to subjective questions by finding a matching ratio for the keywords in instructor and student answers. The matching ratio is achieved based on semantic and document similarity. The assessment system is composed of four modules: preprocessing, keyword expansion, matching, and grading. A survey and case study were used in the research design to validate the proposed system. The examination assessment system will help instructors to save time, costs, and resources, while increasing efficiency and improving the productivity of exam setting and assessments.
TWO DISCRETE BINARY VERSIONS OF AFRICAN BUFFALO OPTIMIZATION METAHEURISTICcscpconf
African Buffalo Optimization (ABO) is one of the most recent swarms intelligence based metaheuristics. ABO algorithm is inspired by the buffalo’s behavior and lifestyle. Unfortunately, the standard ABO algorithm is proposed only for continuous optimization problems. In this paper, the authors propose two discrete binary ABO algorithms to deal with binary optimization problems. In the first version (called SBABO) they use the sigmoid function and probability model to generate binary solutions. In the second version (called LBABO) they use some logical operator to operate the binary solutions. Computational results on two knapsack problems (KP and MKP) instances show the effectiveness of the proposed algorithm and their ability to achieve good and promising solutions.
DETECTION OF ALGORITHMICALLY GENERATED MALICIOUS DOMAINcscpconf
In recent years, many malware writers have relied on Dynamic Domain Name Services (DDNS) to maintain their Command and Control (C&C) network infrastructure to ensure a persistence presence on a compromised host. Amongst the various DDNS techniques, Domain Generation Algorithm (DGA) is often perceived as the most difficult to detect using traditional methods. This paper presents an approach for detecting DGA using frequency analysis of the character distribution and the weighted scores of the domain names. The approach’s feasibility is demonstrated using a range of legitimate domains and a number of malicious algorithmicallygenerated domain names. Findings from this study show that domain names made up of English characters “a-z” achieving a weighted score of < 45 are often associated with DGA. When a weighted score of < 45 is applied to the Alexa one million list of domain names, only 15% of the domain names were treated as non-human generated.
GLOBAL MUSIC ASSET ASSURANCE DIGITAL CURRENCY: A DRM SOLUTION FOR STREAMING C...cscpconf
The document proposes a blockchain-based digital currency and streaming platform called GoMAA to address issues of piracy in the online music streaming industry. Key points:
- GoMAA would use a digital token on the iMediaStreams blockchain to enable secure dissemination and tracking of streamed content. Content owners could control access and track consumption of released content.
- Original media files would be converted to a Secure Portable Streaming (SPS) format, embedding watermarks and smart contract data to indicate ownership and enable validation on the blockchain.
- A browser plugin would provide wallets for fans to collect GoMAA tokens as rewards for consuming content, incentivizing participation and addressing royalty discrepancies by recording
IMPORTANCE OF VERB SUFFIX MAPPING IN DISCOURSE TRANSLATION SYSTEMcscpconf
This document discusses the importance of verb suffix mapping in discourse translation from English to Telugu. It explains that after anaphora resolution, the verbs must be changed to agree with the gender, number, and person features of the subject or anaphoric pronoun. Verbs in Telugu inflect based on these features, while verbs in English only inflect based on number and person. Several examples are provided that demonstrate how the Telugu verb changes based on whether the subject or pronoun is masculine, feminine, neuter, singular or plural. Proper verb suffix mapping is essential for generating natural and coherent translations while preserving the context and meaning of the original discourse.
EXACT SOLUTIONS OF A FAMILY OF HIGHER-DIMENSIONAL SPACE-TIME FRACTIONAL KDV-T...cscpconf
In this paper, based on the definition of conformable fractional derivative, the functional
variable method (FVM) is proposed to seek the exact traveling wave solutions of two higherdimensional
space-time fractional KdV-type equations in mathematical physics, namely the
(3+1)-dimensional space–time fractional Zakharov-Kuznetsov (ZK) equation and the (2+1)-
dimensional space–time fractional Generalized Zakharov-Kuznetsov-Benjamin-Bona-Mahony
(GZK-BBM) equation. Some new solutions are procured and depicted. These solutions, which
contain kink-shaped, singular kink, bell-shaped soliton, singular soliton and periodic wave
solutions, have many potential applications in mathematical physics and engineering. The
simplicity and reliability of the proposed method is verified.
AUTOMATED PENETRATION TESTING: AN OVERVIEWcscpconf
The document discusses automated penetration testing and provides an overview. It compares manual and automated penetration testing, noting that automated testing allows for faster, more standardized and repeatable tests but has limitations in developing new exploits. It also reviews some current automated penetration testing methodologies and tools, including those using HTTP/TCP/IP attacks, linking common scanning tools, a Python-based tool targeting databases, and one using POMDPs for multi-step penetration test planning under uncertainty. The document concludes that automated testing is more efficient than manual for known vulnerabilities but cannot replace manual testing for discovering new exploits.
CLASSIFICATION OF ALZHEIMER USING fMRI DATA AND BRAIN NETWORKcscpconf
Since the mid of 1990s, functional connectivity study using fMRI (fcMRI) has drawn increasing
attention of neuroscientists and computer scientists, since it opens a new window to explore
functional network of human brain with relatively high resolution. BOLD technique provides
almost accurate state of brain. Past researches prove that neuro diseases damage the brain
network interaction, protein- protein interaction and gene-gene interaction. A number of
neurological research paper also analyse the relationship among damaged part. By
computational method especially machine learning technique we can show such classifications.
In this paper we used OASIS fMRI dataset affected with Alzheimer’s disease and normal
patient’s dataset. After proper processing the fMRI data we use the processed data to form
classifier models using SVM (Support Vector Machine), KNN (K- nearest neighbour) & Naïve
Bayes. We also compare the accuracy of our proposed method with existing methods. In future,
we will other combinations of methods for better accuracy.
VALIDATION METHOD OF FUZZY ASSOCIATION RULES BASED ON FUZZY FORMAL CONCEPT AN...cscpconf
The document proposes a new validation method for fuzzy association rules based on three steps: (1) applying the EFAR-PN algorithm to extract a generic base of non-redundant fuzzy association rules using fuzzy formal concept analysis, (2) categorizing the extracted rules into groups, and (3) evaluating the relevance of the rules using structural equation modeling, specifically partial least squares. The method aims to address issues with existing fuzzy association rule extraction algorithms such as large numbers of extracted rules, redundancy, and difficulties with manual validation.
PROBABILITY BASED CLUSTER EXPANSION OVERSAMPLING TECHNIQUE FOR IMBALANCED DATAcscpconf
In many applications of data mining, class imbalance is noticed when examples in one class are
overrepresented. Traditional classifiers result in poor accuracy of the minority class due to the
class imbalance. Further, the presence of within class imbalance where classes are composed of
multiple sub-concepts with different number of examples also affect the performance of
classifier. In this paper, we propose an oversampling technique that handles between class and
within class imbalance simultaneously and also takes into consideration the generalization
ability in data space. The proposed method is based on two steps- performing Model Based
Clustering with respect to classes to identify the sub-concepts; and then computing the
separating hyperplane based on equal posterior probability between the classes. The proposed
method is tested on 10 publicly available data sets and the result shows that the proposed
method is statistically superior to other existing oversampling methods.
CHARACTER AND IMAGE RECOGNITION FOR DATA CATALOGING IN ECOLOGICAL RESEARCHcscpconf
Data collection is an essential, but manpower intensive procedure in ecological research. An
algorithm was developed by the author which incorporated two important computer vision
techniques to automate data cataloging for butterfly measurements. Optical Character
Recognition is used for character recognition and Contour Detection is used for imageprocessing.
Proper pre-processing is first done on the images to improve accuracy. Although
there are limitations to Tesseract’s detection of certain fonts, overall, it can successfully identify
words of basic fonts. Contour detection is an advanced technique that can be utilized to
measure an image. Shapes and mathematical calculations are crucial in determining the precise
location of the points on which to draw the body and forewing lines of the butterfly. Overall,
92% accuracy were achieved by the program for the set of butterflies measured.
SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...cscpconf
Smart cities utilize Internet of Things (IoT) devices and sensors to enhance the quality of the city
services including energy, transportation, health, and much more. They generate massive
volumes of structured and unstructured data on a daily basis. Also, social networks, such as
Twitter, Facebook, and Google+, are becoming a new source of real-time information in smart
cities. Social network users are acting as social sensors. These datasets so large and complex
are difficult to manage with conventional data management tools and methods. To become
valuable, this massive amount of data, known as 'big data,' needs to be processed and
comprehended to hold the promise of supporting a broad range of urban and smart cities
functions, including among others transportation, water, and energy consumption, pollution
surveillance, and smart city governance. In this work, we investigate how social media analytics
help to analyze smart city data collected from various social media sources, such as Twitter and
Facebook, to detect various events taking place in a smart city and identify the importance of
events and concerns of citizens regarding some events. A case scenario analyses the opinions of
users concerning the traffic in three largest cities in the UAE
SOCIAL NETWORK HATE SPEECH DETECTION FOR AMHARIC LANGUAGEcscpconf
The anonymity of social networks makes it attractive for hate speech to mask their criminal
activities online posing a challenge to the world and in particular Ethiopia. With this everincreasing
volume of social media data, hate speech identification becomes a challenge in
aggravating conflict between citizens of nations. The high rate of production, has become
difficult to collect, store and analyze such big data using traditional detection methods. This
paper proposed the application of apache spark in hate speech detection to reduce the
challenges. Authors developed an apache spark based model to classify Amharic Facebook
posts and comments into hate and not hate. Authors employed Random forest and Naïve Bayes
for learning and Word2Vec and TF-IDF for feature selection. Tested by 10-fold crossvalidation,
the model based on word2vec embedding performed best with 79.83%accuracy. The
proposed method achieve a promising result with unique feature of spark for big data.
GENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXTcscpconf
This article presents Part of Speech tagging for Nepali text using General Regression Neural
Network (GRNN). The corpus is divided into two parts viz. training and testing. The network is
trained and validated on both training and testing data. It is observed that 96.13% words are
correctly being tagged on training set whereas 74.38% words are tagged correctly on testing
data set using GRNN. The result is compared with the traditional Viterbi algorithm based on
Hidden Markov Model. Viterbi algorithm yields 97.2% and 40% classification accuracies on
training and testing data sets respectively. GRNN based POS Tagger is more consistent than the
traditional Viterbi decoding technique.
2. 2 Computer Science & Information Technology (CS & IT)
Figure 1. Information extraction pipeline
statements. Next, because the same aspect can be expressed in different ways (display, screen) it
groups the different expressions of an aspect together to a broader one (e.g. display). The same is
performed for all statements provided about an aspect. In the final step, it generates a summary
about the product based on the aspects and statements. Our goal is to have a pipeline such that
product reviews from arbitrary categories can be summarized. In this work, we focus on step one
(extraction of premises) of the pipeline and leave the remaining steps for future work.
In our case a premise consists of an aspect and one or more personal statements. For instance, for
the earlier example we have the aspect display and three statements: bright, colourful and high
resolution. We assume in this work that aspects within reviews are already known and focus only
on the automatic extraction of subjective phrases. Our statement extraction method is based on
dependency parse trees. From the parse tree, we obtain generalized patterns that highlight the
boundaries of statements and link them to an aspect within a review.
Patterns generated from dependency parse trees have already been investigated for extracting
information from well-formed text [8,9,10] as well as in combination with aspect-based opinion
mining [4,11,14]. However, to the best of our knowledge, such patterns have not been applied to
extract statements for given aspects.
The remainder of the paper is structured as follows. First, we take a short look at other approaches
and methods used to process reviews for information extraction. After that we introduce the data
we work with. Section 4 presents our technical solution to aspect relevant subjective phrase
extraction, followed by Section 5 describing our experimental settings. Results are described and
discussed in Section 6. We conclude our paper in Section 7.
2. RELATED WORK
Opinion mining and sentiment analysis form a wide research field that can be divided into different
areas [3]. In terms of product reviews there has been a focus on aspect-based sentiment analysis
[13,6,15]. In our work, we concentrate on aspect-based opinion mining and aim to extract
statements for given aspects rather than sentiments. Along this line, Sauper et al. apply
an LDA [1] model to simultaneously extract aspects and statements. Unlike us, they use
rather clean data with one aspect per sentence and consider only argumentative sentences,
thus preemptively eliminating any noise in the data. Xu et al. also used LDA to jointly extract
aspects and sentiments; however, they also limit the aspects per sentence to one and extract
both at once. In our case a sentence can have more than one aspect as well as more than one
statement. We also do not assume that our sentences are argumentative.
Furthermore, we apply patterns learned from dependency trees instead of LDA.
Dependency parse inspired patterns were used before in order to extract information from general
texts [8,4,10] as well as online reviews [11,15,17]. In some of these studies the patterns are
manually generated [4,11,15] and others learn them automatically from the data [8,10,17]. Fixed
patterns are used both for learning or extracting aspects [11] and for linking aspects to statements [4].
Qiu et al. [15] apply relation patterns to find new aspects and statements. Their use of relation
patterns is quite successful but, unlike ours, has the clear restrictions of static patterns. Unlike in our
study, the patterns generated in previous studies have not been applied to extract personal
statements for given aspects but were rather used, for instance, to extract entity or sentiment related
information.
Other approaches work in the opposite direction, meaning that they search for aspects given certain,
ambiguous, statements. Yauris et al. [19], for example, apply the methods used in [15] to extract
aspects from game reviews; however, their statements are limited to adjectives only, while our
statements can be whole phrases. Hu et al. [20] use a frequency-based approach to extract
aspects or features. The sentiment is given by an orientation and not by the actual information, as
done here. The same underlying method was later also used and enhanced by Marrese-Taylor et
al. [18], who conducted a user study with a visual overview of the sentiment for each
aspect.
3. DATA
The raw data is taken from Amazon reviews provided by [5]. These consist in total of 142.8
million reviews, from which we annotated 400 randomly selected reviews. The reviews come
from 4 different categories or, respectively, 4 different products with a sufficient review count.
We annotated aspects and personal statements within the reviews. Statements are defined as
certain assertions given by the reviewer. These can also be seen as a stated opinion or sentiment
about some part of a product.
The aspect describes what part or characteristic of the product is being discussed. Aspects are also
seen as an opinion target, like the ones used in [16]. All the reviews in our data were annotated by
one single expert. Altogether we found 1,666 aspects and 1,987 statements within the annotated
reviews. Among the reviews there are a few cases where the review contains only the aspect
annotation and does not convey any statement. In our application scenario, we filter out such
cases and focus only on reviews entailing both aspect and statements. The total number of
reviews containing both annotation types is 1,966. In most cases a review contains only a single
aspect and one or more statements. In this case all statements are linked to the single aspect.
However, there are also cases where a review contains more than one aspect as the following
example shows: The keyboard and trackpad of this notebook is quite sturdy but not well designed.
This example sentence contains the aspects keyboard and trackpad. The statements are quite
sturdy and not well designed. Since there are two aspects both statements are regarded as
connected to each of the aspects. We use these 400 reviews to learn patterns based on dependency
parse information. These patterns are in turn used to automatically extract subjective statements
as well as to link them to aspects.
Table 1. Annotated Data
Categories    Claims    Premises    Relations
SD Card          333         399          396
Earphones        456         549          549
Keyboard         427         535          517
E-reader         450         504          504
all            1,666       1,987        1,966
4. METHODS
The task of extracting the complete statements is split into two successive steps. First, we identify
the position of a statement within the sentence and afterwards we limit the borders of the
statement. This limitation is needed because a statement might not be given in a single word and
can consist of a certain part of the containing sentence. Looking at the previous example
sentence: The keyboard and trackpad of this notebook is quite sturdy but not well designed, the
statements are limited to the words quite sturdy and not well designed. When retrieving only a
partial statement its meaning might be drastically altered. By excluding, for instance, the word not
in the second statement the meaning is inverted and the actual information is lost.
For learning patterns, we use dependency parse trees, which we obtain using the DKPro
framework [2], and word types (POS) for each word. Example dependency trees are shown in
Figures 2 to 4. Note that for POS tags with multiple variations, like nouns, we abstracted them to
one general form. For instance, in the sentences The display is bright and The displays are bright
the noun (aspect) display is described by an adjective, the statement bright. When looking at
the POS tags of the aspects we have one tag for a singular and one for a plural noun. Using the specific POS
tags would generate two different patterns. To avoid this, we simply use an abstract NOUN as
word type for this node in the aspect.
Note that the information about the quantity is not needed for our purposes, as in the extraction and
outcome each of these nouns is connected to the adjective, giving us the information how each
aspect is described by its statement. This means that the correct noun, whether plural or singular,
will be linked to the adjective and so the information about the quantity is still present.
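The abstraction of noun tags described above can be illustrated with a minimal sketch. It assumes Penn Treebank style tags (NN, NNS, NNP, NNPS); the helper names abstract_pos and word_types are hypothetical and not part of the DKPro framework.

```python
# Assumption: Penn Treebank noun tags; only nouns are abstracted in this sketch.
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}


def abstract_pos(tag: str) -> str:
    """Return the generalized word type for a POS tag (hypothetical helper)."""
    return "NOUN" if tag in NOUN_TAGS else tag


def word_types(tagged_sentence):
    """Map a list of (token, tag) pairs to their generalized word types."""
    return [abstract_pos(tag) for _, tag in tagged_sentence]


singular = [("The", "DT"), ("display", "NN"), ("is", "VBZ"), ("bright", "JJ")]
plural = [("The", "DT"), ("displays", "NNS"), ("are", "VBP"), ("bright", "JJ")]

# Both aspect nodes end up with the same word type, so only one pattern
# is generated for the aspect node instead of two.
print(word_types(singular)[1], word_types(plural)[1])
```

The tokens themselves are kept, so the singular or plural noun is still available when the extracted statement is linked to it.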
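A pattern takes the form $p = \langle t(w), d, t(w_{+1}) \rangle$, with $t(\cdot)$ denoting the word type, $d$ the direction of the link and $w_{+i}$ being the $i$-th succeeding word after word $w$ in the pattern.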
Figure 2. Dependency tree: The display works great.
A simple, single pattern as seen in Figure 2 contains the word types of the two connected nodes
and the direction of the link d. More complex patterns, such as those in Figure 3
and Figure 4, are nested. This indicates which succeeding edges are needed to link the aspect to
the statement in these cases.
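One possible way to hold such simple and nested patterns in memory is a nested tuple. The sketch below is only an illustration; the names build_pattern and render are hypothetical, and the printed form follows the angle-bracket notation used for the example patterns in this paper.

```python
def build_pattern(types, directions):
    """Fold a path of word types and link directions into a nested pattern.

    types has one more element than directions, e.g.
    types=["NN", "VBZ", "JJ"], directions=["+", "-"].
    """
    if not directions or len(types) != len(directions) + 1:
        raise ValueError("need n directions and n + 1 word types, n >= 1")
    pattern = types[-1]
    # Wrap from the last edge back to the first one.
    for word_type, direction in zip(reversed(types[:-1]), reversed(directions)):
        pattern = (word_type, direction, pattern)
    return pattern


def render(pattern):
    """Print a pattern in angle-bracket notation."""
    if isinstance(pattern, str):
        return pattern
    head, direction, tail = pattern
    return f"<{head}, {direction}, {render(tail)}>"


simple = build_pattern(["NN", "VBZ"], ["+"])
nested = build_pattern(["NN", "VBZ", "JJ"], ["+", "-"])
print(render(simple))
print(render(nested))
```

Because tuples are hashable, patterns stored this way can be used directly as dictionary keys when counting how often each pattern occurs.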
Besides describing the relationship between an aspect and a statement, the patterns can also be
used to describe a statement itself. This allows us to keep the complexity of our patterns low, as they are used not only
to find a link between an aspect and a statement but also to extract a complete statement. The whole process of
extracting statements is divided into two steps: head identification of a statement (the linking to
an aspect) and boundary detection (the limiting of a statement). Both the limiting and linking steps are
detailed in the following sections.
4.1 Limiting
The limiting of a statement defines its length and content. In order to extract only the relevant
information, we need to distinguish the relevant part of a sentence from the irrelevant ones. To
achieve this, we use the underlying dependencies within a statement. A statement consists of
several words forming a logical and rhetorical structure and has one certain root node. By
determining this root node, we can extract a subtree containing all the words of the statement.
Based on this subtree we create a pattern describing the word types and dependencies of the
words within the statement. In Figure 3 we can see a statement with one root node, the
noun (NN) colours. The adjective bright is linked as an adjectival modifier (amod) to this noun.
We can use these pieces of information to limit our pattern to the noun and its adjectival modifier. Apart from
determining the boundaries of patterns we can also use these root nodes as a clear target for the
preceding step, the linking between an aspect and its statements, which we describe in the next
section.
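The subtree extraction used for limiting can be sketched as follows. The representation of the parse as a list of head indices and the function name subtree are assumptions made for this illustration; the attachment of the tokens in the example sentence is likewise assumed and not taken from actual parser output.

```python
def subtree(heads, root):
    """Return the sorted indices of root and all tokens depending on it,
    directly or indirectly. heads[i] is the head index of token i
    (None for the sentence root)."""
    children = {}
    for node, head in enumerate(heads):
        children.setdefault(head, []).append(node)
    selected, stack = [], [root]
    while stack:
        node = stack.pop()
        selected.append(node)
        stack.extend(children.get(node, []))
    return sorted(selected)


tokens = ["The", "display", "has", "bright", "colours"]
# Assumed attachment: The -> display, display -> has, bright -> colours, colours -> has.
heads = [1, 2, None, 4, 2]

statement_root = tokens.index("colours")
statement = [tokens[i] for i in subtree(heads, statement_root)]
print(" ".join(statement))
```

Starting from the statement root colours, only the root and its modifier bright are collected, so the verb has stays outside the statement.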
Figure 3. Dependency tree: The Display has bright colours.
Figure 4. Dependency tree: It has a bright and colourful display.
4.2 Linking
Patterns are also used for linking an aspect to its statements. Similar to the statement extraction
we also determine the root nodes of the aspects as a start of our patterns. For each pair of a given
aspect and extracted statement we now have two root nodes for which we extract a linking
pattern. For most aspects in the reviews, only one statement is given. For these cases, the path
from the aspect to the head of the statement is taken as the pattern. These cases as well as the
extracted linking patterns are shown in Figures 2 and 3. However, reviews might have complex
structures, such as containing more than one subjective phrase (see Figure 4). For these cases, we
generate several patterns where each pattern captures only one path between the aspect and the
head of each existing subjective phrase.
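A minimal sketch of how the path between an aspect head and a statement head could be read off a dependency tree is given below. Several points are assumptions made only for this illustration: the tree is given as a list of head indices, the direction + is taken to mean a step from a dependent up to its head and - a step from a head down to a dependent, and the function names are hypothetical.

```python
def ancestors(heads, node):
    """Return the chain node, head(node), ... up to the sentence root."""
    chain = [node]
    while heads[chain[-1]] is not None:
        chain.append(heads[chain[-1]])
    return chain


def linking_path(heads, tags, aspect, statement):
    """Return (word types, directions) along the path from aspect to statement."""
    up = ancestors(heads, aspect)
    down = ancestors(heads, statement)
    common = next(n for n in up if n in down)      # lowest common ancestor
    up_part = up[: up.index(common) + 1]           # aspect ... common ancestor
    down_part = down[: down.index(common)][::-1]   # below common ... statement
    nodes = up_part + down_part
    directions = ["+"] * (len(up_part) - 1) + ["-"] * len(down_part)
    return [tags[n] for n in nodes], directions


# "The display works great." with an assumed attachment of the tokens.
tags = ["DT", "NN", "VBZ", "JJ"]
heads = [1, 2, None, 2]
types, directions = linking_path(heads, tags, aspect=1, statement=2)
print(types, directions)
```

For a sentence with several statements, the same function would be called once per statement head, giving one path per aspect and statement pair.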
4.3 Selecting patterns
For both of our steps we have to select the right patterns to apply. This is needed because the
extracted patterns can partially overlap each other. When looking again at the example linking
patterns from Figure 2, <NN, +, VBZ>, and Figure 3, <NN, +, <VBZ, -, JJ>>, we can see that
both patterns describe the first edge identically. The first pattern, however, ends after this edge, while
the second pattern continues with another edge. In cases where we can apply the second pattern
we could also apply the first one. Therefore, we have to prefer some patterns over others to
increase overall performance as well as to have general patterns as much as possible. To achieve
this, we use support and accuracy, as well as a combination of both computed over the patterns.
Support The support of a pattern states how often this pattern is observed. Quite common is a
linking pattern like the one seen in Figure 3, which is extracted from the sentence: The display has bright
colours. This pattern is obtained from every sentence that has the structure ASPECT
VERB STATEMENT. Instead of adding a new pattern each time, we increase the support of the
first pattern. The support of a limiting pattern is calculated similarly: each occurrence of a pattern
increases its support.
Accuracy The accuracy of a pattern is calculated by evaluating how often a pattern can be
correctly applied in our data. When we apply a linking pattern we only know the head node of the
aspect. In the patterns from Figures 2 and 3 the aspect head node has the same type.
Assuming we only have those two linking patterns, we can apply the first pattern not only in the
first example but also in the second, as we have the same edge from the noun to the verb. This
would result in one correct and one false linking and thus an accuracy of 0.5.
For the limiting patterns, we proceed similarly.
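The two measures can be sketched with simple counters. The data structures are assumptions for illustration: observed patterns are given as strings, and each application of a pattern is recorded as a pair of the pattern and whether the resulting link was correct.

```python
from collections import Counter


def support(observed_patterns):
    """Count how often each pattern was observed in the training data."""
    return Counter(observed_patterns)


def accuracy(applications):
    """applications: iterable of (pattern, was_correct) pairs.
    Returns the share of correct applications per pattern."""
    applied, correct = Counter(), Counter()
    for pattern, ok in applications:
        applied[pattern] += 1
        correct[pattern] += int(ok)
    return {p: correct[p] / applied[p] for p in applied}


first = "<NN, +, VBZ>"
second = "<NN, +, <VBZ, -, JJ>>"

# Each example sentence yields its own pattern once.
sup = support([first, second])

# The first pattern matches both example sentences but is only correct
# in the first one; the second pattern matches only its own sentence.
acc = accuracy([(first, True), (first, False), (second, True)])
print(sup[first], acc[first], acc[second])
```

The first pattern reaches the accuracy of 0.5 described above, since it is applied twice and correct once.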
Average accuracy and support These support and accuracy values are used to rank the patterns
in order to determine the best ones. Additionally, we propose a third ranking by averaging over
the normalized accuracy and support. The normalized accuracy and support are calculated
over all patterns, with a(p) as the accuracy and s(p) as the support of a pattern p, and P as the set of all patterns.
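The combined ranking can be sketched as below. The exact normalization is not recoverable from the text here, so this sketch assumes that each value is divided by the maximum over all patterns in P; a different normalization (for instance dividing by the sum) would change the scores. The function names and the example values are hypothetical.

```python
def normalize(values):
    """Scale values to [0, 1] by dividing by the maximum (assumed normalization)."""
    top = max(values.values())
    return {p: (v / top if top else 0.0) for p, v in values.items()}


def combined_rank(accuracy, support):
    """Average of normalized accuracy and normalized support per pattern."""
    norm_acc = normalize(accuracy)
    norm_sup = normalize(support)
    return {p: (norm_acc[p] + norm_sup[p]) / 2 for p in accuracy}


# Hypothetical values for three patterns.
accuracy = {"p1": 0.5, "p2": 1.0, "p3": 0.25}
support = {"p1": 4, "p2": 2, "p3": 1}

scores = combined_rank(accuracy, support)
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

In this toy example a frequent but less accurate pattern and a rare but accurate one receive the same combined score, which shows how the averaged ranking trades off the two criteria.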
Threshold For the task of limiting a statement we use the best pattern (most highly ranked
pattern) to select a single statement. However, when we want to link the statements to the aspects
we have the problem that there can be multiple links per aspect, and using only the most highly ranked linking
pattern does not resolve the problem. Figure 4, for example, has two statements, bright and
colourful. When we select only one linking pattern we can only retrieve one of the statements. To
retrieve both statements, we have to apply more than one linking pattern. We determine the
number of patterns that need to be applied using an adaptive threshold $t_a$. This threshold is
calculated by $t_a = \mathrm{rank}(p_h) \cdot (1 - r)$, where $\mathrm{rank}(p_h)$ is the value of the highest matching pattern
and $r$ is the percentage of decline which we allow. For our linking patterns, we allow a 10%
decline in performance.
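The adaptive threshold can be illustrated with a short sketch. The function name select_patterns and the example rank values are hypothetical, and keeping patterns whose value is greater than or equal to the threshold (rather than strictly greater) is an assumption.

```python
def select_patterns(scores, r=0.10):
    """Keep all matching patterns whose rank value lies within the allowed
    decline r of the best matching pattern: t_a = rank(p_h) * (1 - r)."""
    if not scores:
        return []
    best = max(scores.values())
    threshold = best * (1 - r)
    kept = [p for p, value in scores.items() if value >= threshold]
    return sorted(kept, key=lambda p: scores[p], reverse=True)


# Hypothetical rank values of three linking patterns matching one aspect.
matching = {"p1": 0.80, "p2": 0.75, "p3": 0.60}
print(select_patterns(matching))
```

With a best value of 0.80 and a 10% decline the threshold is 0.72, so the second pattern is applied as well while the third one is dropped.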
Table 2. Results of the predicted links
Ranking         P      R      F1     P@10   P@20   P@50
LinkBaseline    .43    .43    .43    -      -      -
Accuracy        .54    .47    .50    .69    .64    .64
Support         .41    .34    .37    .28    .32    .39
Acc. & Sup.     .48    .44    .46    .29    .33    .38
5. EXPERIMENTAL SETTINGS
As mentioned in Section 4, we separate our approach for extracting the statements into two
elementary steps: linking to the location of a statement and limiting the extracted statement. For
each step, we compare our results with a different straightforward and robust baseline. For
obtaining patterns, as well as for the evaluation of both steps, we use the gold standard data
described in Section 3. To evaluate the significance of our results, we use a pairwise McNemar
test [12] with Bonferroni correction.
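The significance test can be illustrated as follows. This is a sketch under stated assumptions: it uses the exact (binomial) form of the McNemar test, whereas the text does not specify whether the exact or the chi-square variant was used, and the function names are hypothetical. Here b and c are the numbers of test items on which exactly one of the two compared methods is correct.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant counts.

    b: items method A gets right and method B gets wrong
    c: items method A gets wrong and method B gets right
    """
    n = b + c
    if n == 0:
        return 1.0  # the methods never disagree
    k = min(b, c)
    # Binomial tail probability with success probability 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


raw = [mcnemar_exact(10, 0), mcnemar_exact(5, 5)]
print(raw, bonferroni(raw))
```

The Bonferroni step matters here because several rankings and noise levels are compared pairwise, so each individual p-value is judged against a stricter criterion.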
5.1 Evaluation setup
To evaluate the performance of our statement extraction, we apply 10-fold cross validation. Note
that in each fold we keep only the patterns that occur at least twice. Patterns occurring less
frequently in our training set are not used for statement extraction. This is done to eliminate
possible annotation and grammatical errors from our reviews. We compute precision, recall and
F1-measure to quantify the performance of our pattern extraction approach. Additionally, as we
can rank our retrieved patterns, we calculate the precision at 10, 20 and 50 to evaluate the
quality of the ranking methods used.
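The evaluation measures can be sketched as below. The helper names and the toy data are hypothetical; precision at k is computed with the common definition that divides by k, which is an assumption about the exact variant used.

```python
from collections import Counter


def frequent_patterns(training_patterns, min_count=2):
    """Keep only the patterns seen at least min_count times in a fold."""
    counts = Counter(training_patterns)
    return {p for p, n in counts.items() if n >= min_count}


def precision_recall_f1(extracted, gold):
    """Precision, recall and F1 over sets of extracted and gold items."""
    extracted, gold = set(extracted), set(gold)
    tp = len(extracted & gold)
    precision = tp / len(extracted) if extracted else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def precision_at_k(ranked, gold, k):
    """Share of correct items among the k highest ranked ones."""
    if k <= 0:
        return 0.0
    gold = set(gold)
    # ASSUMPTION: divide by k even if fewer than k items were retrieved.
    return sum(1 for item in ranked[:k] if item in gold) / k


# Toy (aspect, statement) pairs, ordered by the rank of the producing pattern.
ranked = [("display", "bright"), ("display", "colourful"),
          ("price", "great"), ("price", "fair")]
gold = [("display", "bright"), ("display", "colourful"),
        ("price", "fair"), ("battery life", "awesome")]
print(precision_recall_f1(ranked, gold), precision_at_k(ranked, gold, 2))
```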
5.2 Baseline for linking (LinkBaseline)
As a baseline for finding the statements, we extract the nearest adjective and determine whether
this adjective is contained in the searched statement. This is a rather simple approach, as we do
not have any means of limiting a statement based on the adjective, but it is sufficient for
detecting the general area where a statement is located. For our previously chosen example from
Figure 2, we assume the linking is correct if, for the aspect display, the adjective great is chosen
as the link target.
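A minimal sketch of this baseline is given below. It assumes that the sentence is already POS-tagged with Penn Treebank tags, that distance is measured in tokens, and that ties are resolved in favour of the adjective on the left; none of these details are fixed by the description above. The example sentence is invented and is not the sentence from Figure 2.

```python
def nearest_adjective(tokens, aspect_index):
    """Return the adjective closest to the aspect token, or None.

    tokens: list of (word, pos_tag) pairs; adjectives carry JJ, JJR or JJS.
    aspect_index: position of the aspect term in the sentence.
    """
    candidates = [i for i, (_, tag) in enumerate(tokens) if tag.startswith("JJ")]
    if not candidates:
        return None
    # ASSUMPTION: token distance, ties broken towards the left.
    best = min(candidates, key=lambda i: (abs(i - aspect_index), i))
    return tokens[best][0]


def link_is_correct(adjective, gold_statement):
    """The link counts as correct if the adjective occurs in the gold statement."""
    if adjective is None:
        return False
    return adjective.lower() in gold_statement.lower().split()


# Invented example sentence with the aspect "display" at position 1.
tokens = [("The", "DT"), ("display", "NN"), ("is", "VBZ"), ("great", "JJ")]
adjective = nearest_adjective(tokens, aspect_index=1)
print(adjective, link_is_correct(adjective, "great"))
```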
5.3 Baseline for limiting statement (LimitBaseline)
For the limiting step, we decided to use the dependency subtree of the root node as a baseline.
More precisely, we extract every word directly or indirectly dependent on the root node as part
of the statement. This is again a quite simple baseline, and therefore we allow for some noise. We
define noise as additional words retrieved in an extracted statement. For instance, for the example
sentence The display has bright colours in Figure 3, instead of only allowing the statement bright
colours for the aspect display, we also allow has, leading to has bright colours as the statement for
this baseline.
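The baseline and the noise tolerance can be sketched on the example sentence. The head indices below are a hand-written dependency analysis of the sentence, not parser output. Two assumptions are made to reproduce the example: the subtree of the aspect itself is left out of the statement, and the noise level is the maximum number of additional words allowed, ignoring word order.

```python
from collections import Counter


def subtree(heads, root):
    """Indices of root and of all tokens depending on it, directly or indirectly.

    heads[i] is the index of the head of token i, or -1 for the sentence root.
    """
    children = {}
    for i, head in enumerate(heads):
        children.setdefault(head, []).append(i)
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(children.get(node, []))
    return seen


def limit_baseline(words, heads, root, aspect):
    """Words in the subtree of root, without the subtree of the aspect."""
    # ASSUMPTION: the aspect and its dependents are not part of the statement.
    keep = subtree(heads, root) - subtree(heads, aspect)
    return [words[i] for i in sorted(keep)]


def matches_with_noise(extracted, gold, max_noise=0):
    """True if all gold words are extracted with at most max_noise extra words."""
    e, g = Counter(extracted), Counter(gold)
    if g - e:  # some gold word is missing
        return False
    return sum((e - g).values()) <= max_noise


words = ["The", "display", "has", "bright", "colours"]
heads = [1, 2, -1, 4, 2]  # hand-written analysis, "has" is the root
statement = limit_baseline(words, heads, root=2, aspect=1)
print(statement, matches_with_noise(statement, ["bright", "colours"], max_noise=1))
```

With a noise level of 0 the baseline output has bright colours does not match the gold statement bright colours, while a noise level of 1 accepts it.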
6. RESULTS
As previously mentioned, we first look at the results of the individual steps and then consider the
performance of the complete statement extraction.
Table 3. Performance of the statement limitation methods

Ranking        P     R     F1    P@10  P@20  P@50
LimitBaseline  .51   .48   .50   -     -     -
Accuracy       .46   .46   .46   .64   .68   .73
Support        .26   .21   .23   .64   .46   .24
Acc. & Sup.    .35   .29   .32   .55   .62   .49
Table 2 shows the performance of our linking step. The best performance is achieved when
accuracy alone is used to rank the patterns. The support ranking performs worse overall than all
the others, including the baseline. Looking at the precision at positions 10, 20 and 50, the
accuracy ranking shows only a small drop in precision from precision@10 to precision@20. The
support and acc. & sup. rankings increase in precision from position 10 to 20, but they are still
vastly outperformed by the accuracy ranking.
Results for limiting a statement are shown in Table 3. We evaluated only the exact matches
between the extracted statements and the gold standard. As the table shows, the baseline
performs quite well and is overall better than our best results. The accuracy ranking outperforms
our other rankings by more than .10 in precision, recall and F1 score. This may be contrary
to intuition, as the support of a pattern indicates its popularity and should therefore improve
the recall. Relying on the most frequent pattern should also yield the most correct results. The
data, however, shows a significantly (p value) worse performance for the support ranking
compared to the accuracy ranking.
Table 4 shows the results for the complete extraction process (statement extraction and aspect
linking) with different noise levels. Noise, as described in Section 5.3, consists of additional
words extracted along with our statements. In our testing data, we have aspect and statement
pairs. In the complete extraction process, we aim to determine such pairs too. If the extracted
pair is correct according to our evaluation criteria, we have a positive extraction; otherwise the
extracted pair is considered incorrect. From the results, we see again that the accuracy ranking
performs best for all the metrics. Compared to the previous results, however, overall performance
drops noticeably, from an F1 score of about .50 to only .31 for the precise results without noise.
When we allow more noise, our results improve by .07 points in precision, recall and F1 score.
When comparing the results from the different noise levels incrementally, we have a significant
improvement (p-value < .01) between each noise level step. Furthermore, the accuracy-ranked
results remain quite stable, with a precision of .50 or above at positions 10, 20 and 50.
Table 4. Results for extracting statements

Noise  Ranking        P     R     F1    P@10  P@20  P@50
0      Accuracy       .32   .30   .31   .52   .53   .50
       Support        .09   .07   .08   .00   .05   .06
       Acc. & Sup.    .14   .12   .13   .00   .01   .14
1      Accuracy       .36   .34   .35   .52   .53   .51
       Support        .09   .07   .08   .00   .05   .06
       Acc. & Sup.    .22   .20   .21   .00   .01   .20
2      Accuracy       .39   .37   .38   .52   .53   .52
       Support        .10   .08   .09   .00   .05   .06
       Acc. & Sup.    .24   .22   .23   .00   .01   .22
6.1 Discussion
Our results show that the ranking of the patterns has an enormous influence on the performance
of the extraction methods. The large performance drop between the separate steps and the
complete extraction indicates that, although the individual patterns perform rather well, the
selection of the correct pair of patterns can be improved. Increasing the noise level in the
statements improves our results considerably. This suggests, however, that improving the patterns
and their selection could lead to further improvement, as either the link is not complete or the
patterns are too vague for a better extraction. Either way, there is room for improvement.
Along these lines, we performed an error analysis. We manually inspected statements which were
extracted by our patterns. Table 5 shows some of these statements. Most of the shorter statements,
with one or two words, are correct, and even the longest and most complex one is extracted
completely. Some extracted statements, like great and reasonable for the aspect price, were most
likely extracted by the wrong pattern. The whole sentence is the following: Besides that this card
is great and very reasonable price of $50. The statement great refers to the aspect card, but
without knowledge of the first part of the sentence this statement could also be related to
the aspect price.
Another area that requires further attention is the problem with erroneous reviews. We have seen
several reviews that were problematic and yielded wrong dependency parse trees. We aim to
implement detection methods for these erroneous cases, so that we can exclude them from
processing.
7. CONCLUSION & FUTURE WORK
In this work, we described the extraction of aspect-based statements from product reviews
through patterns extracted from dependency parse trees. We introduced methods for identifying
the head of a statement and detecting the boundary for the statement given the head. Our
evaluation results show that the best method for choosing reliable patterns in both steps
separately, as well as at once, is the accuracy of the pattern.
Above, we already discussed some avenues for improvement. In addition to these, we also want to
tackle the automatic extraction of aspects. Finally, we aim to use the aspects as well as all their
assigned statements to generate summaries. Such summaries can be used by customers to satisfy
their information needs and help them in their decision making.
Table 5. Example extracted subjective phrases

Aspect        Extracted statement                                      Correct statement
price         fair                                                     fair
              very reasonable                                          very reasonable
              great and reasonable                                     very reasonable
              low                                                      low
              n’t beat the price                                       cant’t beat
              price matches quality                                    Matches the quality well
battery life  really good                                              really good
              lasted through the movie and several episodes of a tv show   lasted through the movie and several episodes of a tv show
              awesome                                                  awesome
ACKNOWLEDGEMENTS
This work was supported by the Deutsche Forschungsgemeinschaft (DFG) under grant
No. GRK 2167, Research Training Group “User-Centred Social Media”.
REFERENCES
[1] Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent dirichlet allocation. Journal of machine Learning research
3(Jan), 993–1022 (2003)
[2] Eckart de Castilho, R., Gurevych, I.: A broad-coverage collection of portable nlp components for
building shareable analysis pipelines. In: Proceedings of the Workshop on Open Infrastructures and
Analysis Frameworks for HLT. pp. 1–11. Association for Computational Linguistics and Dublin City
University, Dublin, Ireland (August 2014), http://www.aclweb.org/anthology/W14-5201
[3] Chhabra, S., Bedathurb, S.: Summarizing entities: A survey report
[4] Gindl, S., Weichselbraun, A., Scharl, A.: Rule-based opinion target and aspect extraction to acquire
affective knowledge. In: Proceedings of the 22nd International Conference on World Wide Web. pp.
557–564. ACM (2013)
[5] He, R., McAuley, J.: Ups and downs: Modeling the visual evolution of fashion trends with one-class
collaborative filtering. CoRR abs/1602.01585 (2016), http://arxiv.org/abs/1602.01585
[6] Hu, M., Liu, B.: Mining and summarizing customer reviews. In: Proceedings of the Tenth ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 168–177. KDD
’04, ACM, New York, NY, USA (2004), http://doi.acm.org/10.1145/1014052.1014073
[7] Kiritchenko, S., Zhu, X., Cherry, C., Mohammad, S.: Nrc-canada-2014: Detecting aspects and
sentiment in customer reviews. In: Proceedings of the 8th International Workshop on Semantic
Evaluation (SemEval 2014). pp. 437–442. Association for Computational Linguistics and Dublin City
University, Dublin, Ireland (August 2014), http://www.aclweb.org/anthology/S14-2076
[8] Li, P., Jiang, J., Wang, Y.: Generating templates of entity summaries with an entity-aspect model and
pattern mining. In: Proceedings of the 48th annual meeting of the Association for Computational
Linguistics. pp. 640–649. Association for Computational Linguistics (2010)
[9] Li, P., Wang, Y., Jiang, J.: Automatically building templates for entity summary construction.
Information Processing and Management 49(1), 330 – 340 (2013),
http://www.sciencedirect.com/science/article/pii/S0306457312000568
[10] Lippi, M., Torroni, P.: Context-independent claim detection for argument mining. In: Proceedings of
the 24th International Conference on Artificial Intelligence. pp. 185–191. IJCAI’15, AAAI Press
(2015), http://dl.acm.org/citation.cfm?id=2832249.2832275
[11] Maharani, W., Widyantoro, D.H., Khodra, M.L.: Aspect extraction in customer reviews using syntactic
pattern. Procedia Computer Science 59, 244–253 (2015)
[12] McNemar, Q.: Note on the sampling error of the difference between correlated proportions or
percentages. Psychometrika 12(2), 153–157 (Jun 1947), https://doi.org/10.1007/BF02295996
[13] Moghaddam, S., Ester, M.: Opinion digger: An unsupervised opinion miner from unstructured
product reviews. In: Proceedings of the 19th ACM International Conference on Information and
Knowledge Management. pp. 1825–1828. CIKM ’10, ACM, New York, NY, USA (2010),
http://doi.acm.org/10.1145/1871437.1871739
[14] Moghaddam, S., Ester, M.: On the design of lda models for aspect-based opinion mining. In:
Proceedings of the 21st ACM international conference on Information and knowledge management.
pp. 803–812. ACM (2012)
[15] Qiu, G., Liu, B., Bu, J., Chen, C.: Opinion word expansion and target extraction through double
propagation. Computational linguistics 37(1), 9–27 (2011)
[16] YING, D., Yu, J., Jiang, J.: Recurrent neural networks with auxiliary labels for cross-domain opinion
target extraction (2017)
[17] Zhuang, L., Jing, F., Zhu, X.Y.: Movie review mining and summarization. In: Proceedings of the 15th
ACM International Conference on Information and Knowledge Management. pp. 43–50. CIKM ’06,
ACM, New York, NY, USA (2006), http://doi.acm.org/10.1145/1183614.1183625
[18] Marrese-Taylor, E., Velásquez, J.D., Bravo-Marquez, F.: A novel deterministic approach for
aspect-based opinion mining in tourism products reviews. Expert Systems with Applications 41(17),
7764–7775 (2014), https://doi.org/10.1016/j.eswa.2014.05.045
[19] Yauris, K., Khodra, M.L.: Aspect-based summarization for game review using double
propagation. In: 2017 International Conference on Advanced Informatics, Concepts, Theory, and
Applications (ICAICTA). pp. 1–6. Denpasar, Indonesia (2017), doi:10.1109/ICAICTA.2017.8090997,
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8090997&isnumber=8090950
[20] Hu, M., Liu, B.: Mining opinion features in customer reviews. In: AAAI. vol. 4, pp. 755–760
(2004)
[21] Xu, X., Cheng, X., Tan, S., Liu, Y., Shen, H.: Aspect-level opinion mining of online
customer reviews. China Communications 10(3), 25–41 (2013)
AUTHORS
M.Sc. Michael Rist,
Research & Teaching assistant,
Workgroup Information Engineering,
Department of Computer Science and Applied Cognitive Science
University Duisburg-Essen
Dr. Ahmet Aker,
Research & Teaching assistant,
Workgroup Information Engineering,
Department of Computer Science and Applied Cognitive Science
University Duisburg-Essen
Prof. Dr. Norbert Fuhr,
Full Professor,
Workgroup Information Engineering,
Department of Computer Science and Applied Cognitive Science
University Duisburg-Essen