This document summarizes a study that performed sentiment analysis on tweets related to community development programs in Bogor, Indonesia. The researchers collected over 2,000 tweets about two youth awareness activities. They preprocessed the tweets, reduced their features using PCA, and classified the sentiment of each tweet as positive, negative, or neutral using a support vector machine (SVM) model. The SVM was trained using a lexicon-based labeling method. Their results showed that the model provided a sentiment summary that identified the tweets with the most positive sentiment, allowing for evaluation of which activities had a higher success rate according to social media responses.
IRJET- Real Time Sentiment Analysis of Political Twitter Data using Machi...IRJET Journal
This document summarizes a research paper that analyzed sentiments of political tweets related to the Ayodhya issue in India using machine learning. It collected tweets using keywords and preprocessed them by removing URLs, usernames, stop words, and irrelevant data. It then extracted sentiment-bearing words as features. It classified the polarity of each tweet as positive, negative, or neutral using the Vader sentiment analysis tool and calculated overall sentiment scores. It aimed to analyze public opinion on the Ayodhya issue expressed on Twitter.
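The cleaning step described above (removing URLs, usernames, and stop words) can be sketched roughly as follows. This is a minimal illustration, not the paper's code: the stop-word list, the regular expressions, and the function name preprocess_tweet are all assumptions made for the example.

```python
import re

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "the", "is", "are", "on", "in", "of", "to", "and", "this"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")  # http(s) links and bare www links
USER_RE = re.compile(r"@\w+")                   # @username mentions
TOKEN_RE = re.compile(r"[a-z']+")               # runs of letters and apostrophes


def preprocess_tweet(text):
    """Return lowercase tokens with URLs, @usernames and stop words removed."""
    text = URL_RE.sub(" ", text)
    text = USER_RE.sub(" ", text)
    tokens = TOKEN_RE.findall(text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


print(preprocess_tweet("@user The verdict is great news! https://example.com #Ayodhya"))
```

The remaining tokens would then be passed to whatever sentiment scorer is used; the summary names the Vader tool, which is not reproduced here.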
Analyzing sentiment system to specify polarity by lexicon-basedjournalBEEI
Sentiment analysis, which classifies text as positive or negative, is receiving growing attention from researchers. The rapid development of the internet and social media has led people to express their views and opinions publicly. Analyzing the sentiment in these views and opinions affects many fields, such as the services and products that companies offer. Movie reviews need considerable processing before emotion can be detected and classified with high accuracy. The difficulties arise from the structure and grammar of the language and from managing the dictionary. We present a system that assigns scores indicating positive or negative opinion to each distinct entity in the text corpus. We propose an innovative formula to compute the polarity score for each word occurring in the text; once a word is found in the positive or negative dictionary, it is removed from the text. After classification, the words are stored in a list that is used to calculate the accuracy. The results show that the system achieved a best accuracy of 76.585%.
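The abstract does not give the polarity formula, so the sketch below only illustrates the general dictionary-lookup idea it describes. The word lists, the normalized count-difference score, and the "neutral" tie case are all assumptions for illustration, not the authors' method.

```python
# Hypothetical miniature dictionaries; a real system would load full lexicons.
POSITIVE = {"good", "great", "excellent", "love", "enjoyable"}
NEGATIVE = {"bad", "poor", "boring", "hate", "awful"}


def polarity_score(tokens):
    """Return (positives - negatives) / matched words, in [-1, 1]; 0.0 if none match."""
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    matched = pos + neg
    return (pos - neg) / matched if matched else 0.0


def classify(tokens):
    """Map the score to a label; ties and unmatched text fall back to 'neutral'."""
    score = polarity_score(tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify("an excellent and enjoyable movie".split()))
```

Dividing by the number of matched words keeps the score comparable between short and long reviews; a raw count difference would work equally well for the sign-based label.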
THE SURVEY OF SENTIMENT AND OPINION MINING FOR BEHAVIOR ANALYSIS OF SOCIAL MEDIAIJCSES Journal
The internet has turned the world into a global village, and social media has narrowed the gaps between individuals. Communication between people used to be time-consuming and expensive. Social media has become popular because it offers cheaper and faster communication. Besides reducing the barrier of physical distance, it also generates and preserves a huge amount of data. These data are valuable because they reveal the degree of association between people and their opinions. This paper presents a comprehensive analysis of the methods used for user behavior prediction. The comparison provides detailed information on the pros and cons of methods in the domain of sentiment and opinion mining.
Predicting depression using deep learning and ensemble algorithms on raw twit...IJECEIAES
Social networks and microblogging sites such as Twitter are widespread among all generations; people use them to connect and share their feelings, emotions, pursuits, etc. Depression, one of the most common mental disorders, is an acute state of sadness in which a person loses interest in all activities. If not treated promptly, it can have dire consequences, including death. In this era of the virtual world, people are more comfortable expressing their emotions on such sites, which have become part and parcel of everyday life. This research therefore applies machine learning classifiers to a Twitter data set to detect whether a person’s tweet indicates any sign of depression.
Identifying e learner’s opinion using automated sentiment analysis in e-learningeSAT Publishing House
IJRET: International Journal of Research in Engineering and Technology is an international, peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal are to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academicians, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
IRJET - Social Network Stress Analysis using Word Embedding TechniqueIRJET Journal
The document describes a proposed system for predicting stress levels of social network users by analyzing their comments on platforms like Facebook, Twitter, and Instagram. It uses word embedding techniques to translate words from comments into vectors, and then applies decision tree text classification on the vectors to predict whether the comments convey positive, negative, or neutral sentiment. The predicted sentiment analysis is then used to determine an overall stress status for each user. The system is intended to help identify stress early by analyzing what users share on social media, before depression or other issues arise. It aims to provide an easy-to-use web application and dashboard for viewing predicted stress statuses of social network users.
Evaluating the Impact of Gamification in High School Library Media CentersAriel Dagan
Behavioral change in high school students' approach to reading might be stimulated by extrinsic motivators that, through this process, become intrinsic and habit-forming.
A Novel Frame Work System Used In Mobile with Cloud Based Environmentpaperpublications3
Abstract: In recent years, effort has gone into social-based Question and Answer (Q&A) systems, which are used to find answers to non-factual questions. Traditional search engines such as Google and Bing answer only factual questions, for which a direct answer can be retrieved from database servers. The web search engine for the Q&A system does not depend on broadcasting methods or a centralized server to identify friends on the social network. The problem is addressed with a mobile Q&A system in which mobile nodes are helpful for accessing the internet, because these techniques generate low node overload, higher server bandwidth cost, and the highest cost of mobile internet access. Recently, researchers proposed a method called the Distributed Social-Based Mobile Q&A System (SOS), which delivers faster responses to the asker. SOS enables mobile users to forward a question in a decentralized manner in order to get effective, capable, and potential answers from other users. SOS is a lightweight knowledge engineering technique used to find the right person who is ready and willing to answer a question; this type of search reduces the search time and computational cost of the mobile nodes. In this paper we propose a mobile Q&A system in a cloud-based environment, through which data is transmitted from the cloud server to the centralized server at any time.
A Survey on Sentiment Analysis and Opinion MiningIJSRD
In today’s world, social media has given web users a place to express and share their thoughts and opinions on different topics or events. For this reason, opinion mining has gained importance. Sentiment classification and opinion mining study people’s opinions, emotions, and attitudes toward products, services, etc. Sentiment analysis and opinion mining are interchangeable terms. Various approaches and techniques exist for sentiment analysis, such as Naïve Bayes, Decision Trees, Support Vector Machines, Random Forests, and Maximum Entropy. Opinion mining is useful for scientific surveys, political polls, market research, business intelligence, etc. This paper presents a literature review of various techniques used for opinion mining and sentiment analysis.
FEATURE SELECTION AND CLASSIFICATION APPROACH FOR SENTIMENT ANALYSISmlaij
The document describes a proposed model for sentiment analysis of movie reviews using natural language processing and machine learning approaches. The model first applies various data pre-processing techniques to the dataset, including tokenization, pruning, filtering tokens, and stemming. It then investigates the performance of classifiers like Naive Bayes and SVM combined with different feature selection schemes, including term occurrence, binary term occurrence, term frequency and TF-IDF. Experiments are run using n-grams up to 4-grams to determine the best approach for sentiment analysis.
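The four weighting schemes named above, combined with n-grams, can be sketched in plain Python as follows. This is an illustrative sketch only: the function names are made up, and the TF-IDF variant shown (relative term frequency times log(N/df), unsmoothed) is one common definition that may differ from the one used in the paper's toolkit.

```python
import math
from collections import Counter


def ngrams(tokens, n_max):
    """All contiguous n-grams for n = 1..n_max, joined with spaces."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def vectorize(docs, scheme="tfidf", n_max=1):
    """Map each tokenized document to a {term: weight} dictionary."""
    bags = [Counter(ngrams(d, n_max)) for d in docs]
    # Document frequency: iterating a Counter yields each term once per document.
    df = Counter(term for bag in bags for term in bag)
    n_docs = len(docs)
    vectors = []
    for bag in bags:
        total = sum(bag.values())
        vec = {}
        for term, count in bag.items():
            if scheme == "occurrence":    # raw count
                vec[term] = count
            elif scheme == "binary":      # present or not
                vec[term] = 1
            elif scheme == "tf":          # count relative to document length
                vec[term] = count / total
            elif scheme == "tfidf":       # tf scaled by inverse document frequency
                vec[term] = (count / total) * math.log(n_docs / df[term])
            else:
                raise ValueError("unknown scheme: " + scheme)
        vectors.append(vec)
    return vectors


docs = [["good", "good", "film"], ["bad", "film"]]
print(vectorize(docs, scheme="tfidf", n_max=2)[0])
```

With this unsmoothed definition, a term that occurs in every document (here "film") gets weight zero, which is why TF-IDF tends to suppress words that carry no class information.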
Scalable recommendation with social contextual informationeSAT Journals
Abstract: Recommender systems are used to produce effective and useful results in social networks. Social recommendation provides a social network structure, but it is challenging to fuse social contextual factors, which are derived from users’ motivations for social behavior, into social recommendation. Here, we introduce two contextual factors in recommender systems that are used to obtain useful results: a) individual preference and b) interpersonal influence. Individual preference analyzes how the content of an item matches a user’s interests and adopts only the results recommended for that user. Interpersonal influence analyzes user-user interactions and their specific social relations. Beyond this, we propose a novel probabilistic matrix factorization method to fuse them in a latent space. The scalable algorithm provides useful results by analyzing the ranking probability of each user’s social contextual information, and it also incrementally processes the contextual data in large datasets. Keywords: social recommendation, individual preference, interpersonal influence, matrix factorization
A Survey on Sentiment Categorization of Movie ReviewsEditor IJMTER
Sentiment categorization is the process of mining user-generated text content to determine users’ sentiment toward a particular thing. It is an approach to detecting the author’s sentiment regarding some topic, and it is also known as sentiment detection, sentiment analysis, and opinion mining. It is very useful for movie production companies that are interested in knowing how users feel about their movies. For example, the word “excellent” indicates that a review expresses positive emotion about a particular movie. The same applies to movies, songs, cars, holiday destinations, political parties, social network sites, web blogs, discussion forums, and so on. Sentiment categorization can be carried out using three approaches. First, a supervised machine-learning text classifier based on Naïve Bayes, Maximum Entropy, SVM, kNN, or a hidden Markov model. Second, an unsupervised semantic orientation scheme that extracts relevant N-grams from the text and then labels them. Third, a publicly available library based on SentiWordNet.
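As an illustration of the first (supervised) approach, here is a minimal multinomial Naïve Bayes text classifier with add-one smoothing. It is a sketch under assumptions: the class name, the toy training reviews, and the choice to ignore words unseen in training are illustrative and not taken from the survey.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)        # documents per class
        self.word_counts = defaultdict(Counter)    # word counts per class
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        n_docs = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            total = sum(counts.values()) + len(self.vocab)  # smoothed denominator
            logp = math.log(n / n_docs)                     # log prior
            for t in tokens:
                if t in self.vocab:  # words never seen in training are ignored
                    logp += math.log((counts[t] + 1) / total)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


# Toy training data, invented for the example.
train_docs = [["excellent", "acting"], ["great", "story"],
              ["boring", "plot"], ["awful", "acting"]]
train_labels = ["pos", "pos", "neg", "neg"]

model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict(["excellent", "story"]))
```

Working in log space avoids numeric underflow when many word probabilities are multiplied, and the add-one term keeps a single unseen word-class pair from zeroing out the whole product.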
IRJET- Social Network Mental Disorders Detection Via Online Social Media MiningIRJET Journal
This document summarizes a research paper about detecting social network mental disorders via online social media mining. The paper proposes a system to analyze user posts and interactions on social media like Facebook to detect signs of mental disorders like information overload or cyber-relationship addiction. The system uses techniques like sentiment analysis of posts using CNNs and classification of user stress levels using TSVM. If a user is detected as having a mental disorder, the system can recommend nearby hospitals on a map and send the user precautionary information to avoid disorders and promote well-being. The goal is to help detect mental disorders earlier through social media analysis to enable timely clinical intervention.
DYNAMIC LARGE SCALE DATA ON TWITTER USING SENTIMENT ANALYSIS AND TOPIC MODELINGAndry Alamsyah
1. The document presents a case study analyzing tweets about Uber using sentiment analysis and topic modeling to understand public opinion from large-scale social media data.
2. Sentiment analysis classified tweets as positive, negative, or neutral, while topic modeling identified dominant topics of discussion, like promotions or driver complaints.
3. The analyses found that positive tweets often discussed promotions while negative tweets addressed issues like sexual harassment allegations or unsatisfactory drivers.
Political prediction analysis using text mining and deep learningVishwambhar Deshpande
We have proposed a system to determine current sentiment on Twitter using the open-access Twitter API, which includes opinions from different content structures such as latest news, audits, articles, and social media posts, and a deep learning method to study historic data for predicting future results. We utilized Naive Bayes and dictionary-based algorithms to predict the sentiment of live Twitter data.
Framework for opinion as a service on review data of customer using semantics...IJECEIAES
Opinion mining plays a significant role in representing the original and unbiased perception of products and services. However, various challenges are associated with performing effective opinion mining in the present era of distributed computing systems with dynamic user behaviour. Existing approaches are laborious in extracting knowledge from user reviews, which are subjected to several rounds of operations with complex procedures. The proposed system addresses the problem by introducing a novel framework called opinion-as-a-service, which is meant for direct utilization of the extracted knowledge in a user-friendly manner. The proposed system introduces a set of three sequential algorithms that aggregate the incoming stream of opinion data, perform indexing, and then apply semantics to extract knowledge. The study outcome shows that the proposed system outperforms the existing system in mining performance.
Depression and anxiety detection through the Closed-Loop method using DASS-21TELKOMNIKA JOURNAL
Changes in information and communication technology have brought many changes to daily life. The way humans interact is changing, and each form of communication can be expressed directly and instantly. Social media has contributed data of growing size, diversity, capacity, and quality. Building on this, the idea was to observe and measure the tendency toward depression and anxiety through social media using the Closed-Loop method with text mining of Facebook posts. Through pre-processing stages, including text extraction, and using the Naïve Bayes machine learning model for text classification, the early signs of depression and anxiety are measured using DASS-21 parameters. In total, 22,934 Facebook posts, collected from July 2017 until July 2018, were contributed as training and learning data. As a result, an analysis and mapping is available of the social demographics of users and of the factors that usually trigger depression and anxiety, such as grief, illness, household affairs, children's education, and others.
DEEP LEARNING SENTIMENT ANALYSIS OF AMAZON.COM REVIEWS AND RATINGSijscai
The document summarizes research on using deep learning techniques for sentiment analysis of Amazon product reviews and ratings. Specifically, it trains recurrent neural networks using paragraph vectors of reviews to learn product embeddings that capture temporal relationships between reviews. This helps identify mismatches between highly positive/negative reviews and low/high ratings. A web service applies the model to reviews and warns users if the predicted sentiment differs from their given rating.
The document describes research into creating realistic dialog between an agent and a user. It introduces a scenario where an agent assists a user in cooking by guiding them through recipe steps and solving problems. The goal is to use BDI concepts like goals, beliefs and status information to structure the recipe procedure and dialog flow. An example agent implementation is created for the cooking scenario to test approaches to the different phases of recipe selection, planning, instruction and finalization. The results show how BDI elements can support realistic multi-turn dialog between an agent and user.
Recommender systems (RSs) provide individualized suggestions of data or products related to users’ needs. Even though RSs have made substantial progress in theory and algorithm development and have achieved many business successes, how to use the information widely available in online social networks (OSNs) has been largely overlooked. Noting this gap in existing RS research, and taking into account that a user’s choice is greatly influenced by his/her trustworthy friends and their opinions, this paper proposes a Fact Finder technique that improves prevailing recommendation approaches by exploring a new source of data: friends’ short posts in microblogs, treated as micro-reviews. The degree of friends’ sentiment and their level of certainty regarding a user’s choice are identified using machine learning methods, including Naive Bayes, Logistic Regression, and Decision Trees. To verify the proposed Fact Finder, experiments using real social data from the Twitter microblogging service are presented, and the results show the effectiveness and promise of the proposed approach.
CATEGORIZING 2019-N-COV TWITTER HASHTAG DATA BY CLUSTERINGijaia
Unsupervised machine learning techniques such as clustering are widely gaining use with the recent increase in social communication platforms like Twitter and Facebook. Clustering enables the finding of patterns in these unstructured datasets. We collected tweets matching hashtags linked to COVID-19 from a Kaggle dataset. We compared the performance of nine clustering algorithms using this dataset. We evaluated the generalizability of these algorithms using a supervised learning model. Finally, using a selected unsupervised learning algorithm we categorized the clusters. The top five categories are Safety, Crime, Products, Countries and Health. This can prove helpful for bodies using large amount of Twitter data needing to quickly find key points in the data before going into further classification.
KnowMe and ShareMe: Understanding Automatically Discovered Personality Trai...Wookjae Maeng
There is much recent work on using the digital footprints left by people on social media to predict personal traits and gain a deeper understanding of individuals. Due to the veracity of social media, imperfections in prediction algorithms, and the sensitive nature of one’s personal traits, much research is still needed to better understand the effectiveness of this line of work, including users’ preferences of sharing their computationally derived traits. In this paper, we report a two-part study involving 256 participants, which (1) examines the feasibility and effectiveness of automatically deriving three types of personality traits from Twitter, including Big 5 personality, basic human values, and fundamental needs, and (2) investigates users’ opinions of using and sharing these traits. Our findings show there is a potential feasibility of automatically deriving one’s personality traits from social media with various factors impacting the accuracy of models. The results also indicate over 61.5% users are willing to share their derived traits in the workplace and that a number of factors significantly influence their sharing preferences. Since our findings demonstrate the feasibility of automatically inferring a user’s personal traits from social media, we discuss their implications for designing a new generation of privacy-preserving, hyper-personalized systems.
This document summarizes a research paper that proposes a novel approach for dynamic personalized recommendation. It utilizes information from user ratings and profiles to develop dynamic features that describe user preferences over multiple phases of interest. An adaptive weighting algorithm then makes recommendations by weighting these dynamic features based on the amount of rating data available. The proposed approach was tested on public datasets and performed well for dynamic recommendation compared to existing algorithms.
IRJET - Socirank Identifying and Ranking Prevalent News Topics using Social M...IRJET Journal
1. The document proposes a framework called SociRank to identify and rank prevalent news topics using social media factors.
2. SociRank identifies topics prevalent in both social media and news media, and then ranks them based on their media focus in news, user attention in social media, and user interaction regarding the topic.
3. The experiments show that SociRank improves the quality and variety of automatically identified news topics compared to other topic identification and ranking methods.
LAK2011: 1st International Conference on Learning Analytics and Knowledge February 27-March 1, 2011
Banff, Alberta
Anna De Liddo, Simon Buckingham Shum,
Ivana Quinto, Michelle Bachler, Lorella Cannavacciuolo
This document discusses using Twitter data for sentiment analysis and influence tracking. It describes how Twitter data was collected using its APIs and preprocessed by removing links, usernames and stopwords. N-grams and part-of-speech tags were then extracted as features from the tweets. The tweets were classified into positive, negative, neutral or irrelevant categories. Sentiment analysis was performed at the entity level to determine sentiment towards specific topics mentioned in tweets, like products. Influence was tracked using algorithms that rank users based on retweets, followers and mentions.
This document reviews research on predicting personality from Twitter users' tweets using machine learning algorithms. It discusses how tweets have attracted research interest from diverse fields. Different techniques have been used to predict personality from tweets, but there are still shortcomings to address. The aim is to consider the current state of this research area and explore personality prediction from tweets by reviewing past literature and discussing approaches to issues researchers face. It provides an overview of machine learning methods used for personality prediction from tweets, including data collection, preprocessing, model training and evaluation.
A HYBRID CLASSIFICATION ALGORITHM TO CLASSIFY ENGINEERING STUDENTS’ PROBLEMS ...IJDKP
Social networking sites have opened a new horizon for individuals to express their views and opinions. Moreover, they provide a medium for students to share their sentiments, including the struggles and joys of the learning process. Such informal information is of great value for decision making. The large and growing scale of this information calls for automatic classification techniques, and sentiment analysis is one such technique for classifying large data. Existing predictive sentiment analysis techniques are widely used to classify reviews on e-commerce sites to provide business intelligence. However, they are of limited use for decision making in the education system, since they classify sentiments into merely three pre-set categories: positive, negative and neutral. Moreover, classifying students’ sentiments as positive or negative does not provide deeper insight into their problems and perks. In this paper, we propose a novel Hybrid Classification Algorithm to classify engineering students’ sentiments. Unlike traditional predictive sentiment analysis techniques, the proposed algorithm makes the sentiment analysis process descriptive. Moreover, it classifies engineering students’ perks, in addition to their problems, into several categories to help future students and the education system in decision making.
Big five personality prediction based in Indonesian tweets using machine lea...IJECEIAES
The popularity of social media has drawn the attention of researchers who have conducted cross-disciplinary studies examining the relationship between personality traits and behavior on social media. Most current work focuses on personality prediction analysis of English texts, but Indonesian has received scant attention. Therefore, this research aims to predict user’s personalities based on Indonesian text from social media using machine learning techniques. This paper evaluates several machine learning techniques, including naive Bayes (NB), K-nearest neighbors (KNN), and support vector machine (SVM), based on semantic features including emotion, sentiment, and publicly available Twitter profile. We predict the personality based on the Big Five personality model, the most appropriate model for predicting user personality in social media. We examine the relationships between the semantic features and the Big Five personality dimensions. The experimental results indicate that the Big Five personality exhibit distinct emotional, sentimental, and social characteristics and that SVM outperformed NB and KNN for Indonesian. In addition, we observe several terms in Indonesian that specifically refer to each personality type, each of which has distinct emotional, sentimental, and social features.
There are various online networking sites, such as Facebook and Twitter, where students casually discuss their educational
experiences, their opinions, emotions, and concerns about the learning process. Information from such an open environment can
give valuable knowledge about opinions and emotions and help educational organizations gain insight into students’ educational
life. Analysing such data, on the other hand, can be challenging, so qualitative research and a significant data
mining process are needed. Sentiment classification can be done using NLP (Natural Language Processing). For a social
network that provides microblogging services such as Twitter, the incoming tweets can be classified into News, Opinions,
Events, Deals and Private Messages based on the author information available in the tweets. This approach is similar to
TweetStand, which classifies tweets into news and non-news. Even for e-commerce applications, virtual customer
environments can be created using social networking sites. Since the data is ever growing, using data mining techniques can get
difficult, hence we can use data analysis tools
A Survey on Sentiment Analysis and Opinion MiningIJSRD
In today’s world, social media has given web users a place for expressing and sharing their thoughts and opinions on different topics or events. For this reason, opinion mining has gained importance. Sentiment classification and opinion mining is the study of people’s opinions, emotions, and attitudes towards products, services, etc. Sentiment analysis and opinion mining are two interchangeable terms. Various approaches and techniques exist for sentiment analysis, such as Naïve Bayes, Decision Trees, Support Vector Machines, Random Forests, Maximum Entropy, etc. Opinion mining is useful for scientific surveys, political polls, market research, business intelligence, etc. This paper presents a literature review of various techniques used for opinion mining and sentiment analysis.
FEATURE SELECTION AND CLASSIFICATION APPROACH FOR SENTIMENT ANALYSISmlaij
The document describes a proposed model for sentiment analysis of movie reviews using natural language processing and machine learning approaches. The model first applies various data pre-processing techniques to the dataset, including tokenization, pruning, filtering tokens, and stemming. It then investigates the performance of classifiers like Naive Bayes and SVM combined with different feature selection schemes, including term occurrence, binary term occurrence, term frequency and TF-IDF. Experiments are run using n-grams up to 4-grams to determine the best approach for sentiment analysis.
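The four feature weighting schemes named above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names `ngrams` and `weight_documents`, the toy documents, and the unsmoothed `log(N / df)` form of inverse document frequency are all choices made here for illustration (other TF-IDF variants exist).

```python
import math
from collections import Counter

def ngrams(tokens, n_max):
    """Return all n-grams of length 1..n_max as space-joined strings."""
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams

def weight_documents(docs, scheme="tfidf", n_max=1):
    """Map each tokenized document to a {term: weight} dictionary."""
    counts = [Counter(ngrams(doc, n_max)) for doc in docs]
    # Document frequency: the number of documents containing each term.
    df = Counter(term for c in counts for term in c)
    n_docs = len(docs)
    weighted = []
    for c in counts:
        total = sum(c.values())
        row = {}
        for term, k in c.items():
            if scheme == "occurrence":   # raw count of the term
                row[term] = float(k)
            elif scheme == "binary":     # 1 if the term is present
                row[term] = 1.0
            elif scheme == "tf":         # count normalized by document length
                row[term] = k / total
            elif scheme == "tfidf":      # tf scaled by log(N / df), one common variant
                row[term] = (k / total) * math.log(n_docs / df[term])
            else:
                raise ValueError(f"unknown scheme: {scheme}")
        weighted.append(row)
    return weighted

# Hypothetical toy corpus of two tokenized reviews.
docs = [["good", "movie", "good"], ["bad", "movie"]]
for scheme in ("occurrence", "binary", "tf", "tfidf"):
    print(scheme, weight_documents(docs, scheme)[0])
```

Under this TF-IDF variant a term that occurs in every document (here "movie") gets weight zero, which is why it carries no signal for the classifier. Passing `n_max=4` would produce features up to 4-grams, the largest setting the summary mentions.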
Scalable recommendation with social contextual informationeSAT Journals
Abstract: Recommender systems are used to deliver effective and useful results in social networks. Social recommendation draws on the social network structure, but it is challenging to fuse social contextual factors, which are derived from users’ motivations for social behaviors, into social recommendation. Here, we introduce two contextual factors into recommender systems to obtain useful results: a) individual preference and b) interpersonal influence. Individual preference matches item content against the user’s interests so that only results relevant to the user are recommended. Interpersonal influence analyzes user-user interactions and their specific social relations. Beyond this, we propose a novel probabilistic matrix factorization method to fuse them in a latent space. The scalable algorithm provides useful results by analyzing the ranking probability of each user’s social contextual information and also incrementally processes the contextual data in large datasets. Keywords: social recommendation, individual preference, interpersonal influence, matrix factorization
A Survey on Sentiment Categorization of Movie ReviewsEditor IJMTER
Sentiment categorization is the process of mining user-generated text content and determining
the sentiment of the users towards a particular thing. It is the approach of detecting the sentiment of
the author in regard to some topic. It is also known as sentiment detection, sentiment analysis and opinion
mining. It is very useful for movie production companies that are interested in knowing how users feel
about their movies. For example, the word “excellent” indicates that the review expresses positive emotion about
a particular movie. The same applies to movies, songs, cars, holiday destinations, political parties, social
network sites, web blogs, discussion forums and so on. Sentiment categorization can be carried out
using three approaches. First, supervised machine learning text classifiers based on Naïve Bayes,
Maximum Entropy, SVM, kNN, or the hidden Markov model. Second, an unsupervised Semantic
Orientation scheme that extracts relevant N-grams of the text and then labels them. Third, the publicly
available SentiWordNet library.
IRJET- Social Network Mental Disorders Detection Via Online Social Media MiningIRJET Journal
This document summarizes a research paper about detecting social network mental disorders via online social media mining. The paper proposes a system to analyze user posts and interactions on social media like Facebook to detect signs of mental disorders like information overload or cyber-relationship addiction. The system uses techniques like sentiment analysis of posts using CNNs and classification of user stress levels using TSVM. If a user is detected as having a mental disorder, the system can recommend nearby hospitals on a map and send the user precautionary information to avoid disorders and promote well-being. The goal is to help detect mental disorders earlier through social media analysis to enable timely clinical intervention.
DYNAMIC LARGE SCALE DATA ON TWITTER USING SENTIMENT ANALYSIS AND TOPIC MODELINGAndry Alamsyah
1. The document presents a case study analyzing tweets about Uber using sentiment analysis and topic modeling to understand public opinion from large-scale social media data.
2. Sentiment analysis classified tweets as positive, negative, or neutral, while topic modeling identified dominant topics of discussion, like promotions or driver complaints.
3. The analyses found that positive tweets often discussed promotions while negative tweets addressed issues like sexual harassment allegations or unsatisfactory drivers.
Political prediction analysis using text mining and deep learningVishwambhar Deshpande
We have proposed a system to determine current sentiment on Twitter using the Twitter
API for open access, which includes opinions from different content structures like
latest news, audits, articles and social media posts, and a Deep Learning method to
study historic data for predicting future results. We utilized Naive Bayes and
dictionary-based algorithms to predict the sentiment on live Twitter data.
Framework for opinion as a service on review data of customer using semantics...IJECEIAES
Opinion mining plays a significant role in representing the original and unbiased perception of products and services. However, there are various challenges associated with performing effective opinion mining in the present era of distributed computing systems with dynamic user behaviour. Existing approaches are laborious in extracting knowledge from user reviews, which are then subjected to several rounds of operations with complex procedures. The proposed system addresses the problem by introducing a novel framework called opinion-as-a-service, which is meant for direct utilization of the extracted knowledge in the most user-friendly manner. The proposed system introduces a set of three sequential algorithms that perform aggregation of the incoming stream of opinion data, then indexing, followed by applying semantics for extracting knowledge. The study outcome shows that the proposed system is better than the existing system in mining performance.
Depression and anxiety detection through the Closed-Loop method using DASS-21TELKOMNIKA JOURNAL
Changes in information and communication technology have brought many changes to daily
life. The way humans interact is changing. It is possible to express each form of communication directly
and instantly. Social media has contributed data in size, diversity, capacity and quality. Based on this,
the idea was to see and measure the tendency of depression and anxiety through social media using
the Closed-Loop method with text mining of Facebook posts. Through the stages of pre-processing,
including text extraction using the Naïve Bayes machine learning model for text classification, the early
signs of depression and anxiety are measured using DASS-21 parameters. In total, 22,934 Facebook posts
collected from July 2017 until July 2018 were used as training and learning data. As a result,
analysis and mapping of users’ social demographics and of the factors that usually trigger depression and
anxiety, such as grief, illness, household affairs, children’s education and others, are available.
DEEP LEARNING SENTIMENT ANALYSIS OF AMAZON.COM REVIEWS AND RATINGSijscai
The document summarizes research on using deep learning techniques for sentiment analysis of Amazon product reviews and ratings. Specifically, it trains recurrent neural networks using paragraph vectors of reviews to learn product embeddings that capture temporal relationships between reviews. This helps identify mismatches between highly positive/negative reviews and low/high ratings. A web service applies the model to reviews and warns users if the predicted sentiment differs from their given rating.
The document describes research into creating realistic dialog between an agent and a user. It introduces a scenario where an agent assists a user in cooking by guiding them through recipe steps and solving problems. The goal is to use BDI concepts like goals, beliefs and status information to structure the recipe procedure and dialog flow. An example agent implementation is created for the cooking scenario to test approaches to the different phases of recipe selection, planning, instruction and finalization. The results show how BDI elements can support realistic multi-turn dialog between an agent and user.
Recommender systems (RSs) provide individualized suggestions of information or products related to
users’ needs. Even though RSs have made substantial progress in theory and algorithm development and
have achieved many business successes, how to exploit the widely available information in online social
networks (OSNs) has been largely overlooked. Noticing this gap in existing RS research, and taking
into account that a user’s choice is greatly influenced by his/her trustworthy friends and their opinions,
this paper proposes a Fact Finder technique that improves prevailing recommendation approaches by
exploring a new source of data: friends’ short posts in microblogs, treated as micro-reviews. The degree of
friends’ sentiment and their level of certainty regarding a user’s choice are identified using machine
learning techniques including Naive Bayes, Logistic Regression and Decision Trees. To verify the
proposed Fact Finder, experiments using real social data from the Twitter microblog are presented, and
the results show the effectiveness and promise of the proposed approach.
CATEGORIZING 2019-N-COV TWITTER HASHTAG DATA BY CLUSTERINGijaia
Unsupervised machine learning techniques such as clustering are widely gaining use with the recent increase in social communication platforms like Twitter and Facebook. Clustering enables the finding of patterns in these unstructured datasets. We collected tweets matching hashtags linked to COVID-19 from a Kaggle dataset. We compared the performance of nine clustering algorithms using this dataset. We evaluated the generalizability of these algorithms using a supervised learning model. Finally, using a selected unsupervised learning algorithm we categorized the clusters. The top five categories are Safety, Crime, Products, Countries and Health. This can prove helpful for bodies using large amounts of Twitter data needing to quickly find key points in the data before going into further classification.
KnowMe and ShareMe: Understanding Automatically Discovered Personality Trai...Wookjae Maeng
There is much recent work on using the digital footprints left by people on social media to predict personal traits and gain a deeper understanding of individuals. Due to the veracity of social media, imperfections in prediction algorithms, and the sensitive nature of one’s personal traits, much research is still needed to better understand the effectiveness of this line of work, including users’ preferences of sharing their computationally derived traits. In this paper, we report a two-part study involving 256 participants, which (1) examines the feasibility and effectiveness of automatically deriving three types of personality traits from Twitter, including Big 5 personality, basic human values, and fundamental needs, and (2) investigates users’ opinions of using and sharing these traits. Our findings show there is a potential feasibility of automatically deriving one’s personality traits from social media with various factors impacting the accuracy of models. The results also indicate over 61.5% users are willing to share their derived traits in the workplace and that a number of factors significantly influence their sharing preferences. Since our findings demonstrate the feasibility of automatically inferring a user’s personal traits from social media, we discuss their implications for designing a new generation of privacy-preserving, hyper-personalized systems.
This document summarizes a research paper that proposes a novel approach for dynamic personalized recommendation. It utilizes information from user ratings and profiles to develop dynamic features that describe user preferences over multiple phases of interest. An adaptive weighting algorithm then makes recommendations by weighting these dynamic features based on the amount of rating data available. The proposed approach was tested on public datasets and performed well for dynamic recommendation compared to existing algorithms.
IRJET - Socirank Identifying and Ranking Prevalent News Topics using Social M...IRJET Journal
1. The document proposes a framework called SociRank to identify and rank prevalent news topics using social media factors.
2. SociRank identifies topics prevalent in both social media and news media, and then ranks them based on their media focus in news, user attention in social media, and user interaction regarding the topic.
3. The experiments show that SociRank improves the quality and variety of automatically identified news topics compared to other topic identification and ranking methods.
LAK2011: 1st International Conference on Learning Analytics and Knowledge February 27-March 1, 2011
Banff, Alberta
Anna De Liddo, Simon Buckingham Shum,
Ivana Quinto, Michelle Bachler, Lorella Cannavacciuolo
This document discusses using Twitter data for sentiment analysis and influence tracking. It describes how Twitter data was collected using its APIs and preprocessed by removing links, usernames and stopwords. N-grams and part-of-speech tags were then extracted as features from the tweets. The tweets were classified into positive, negative, neutral or irrelevant categories. Sentiment analysis was performed at the entity level to determine sentiment towards specific topics mentioned in tweets, like products. Influence was tracked using algorithms that rank users based on retweets, followers and mentions.
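The preprocessing steps described above (removing links, usernames and stopwords, then extracting n-grams) might look roughly like the following sketch. The stopword list, the regular expressions and the function names are assumptions made for illustration; the summarized document does not specify them, and part-of-speech tagging is left out because it needs an external tagger.

```python
import re

# Tiny illustrative stopword list (an assumption, not the paper's list).
STOPWORDS = {"the", "a", "an", "is", "to", "and", "of", "in"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")   # links
USER_RE = re.compile(r"@\w+")                   # @usernames
TOKEN_RE = re.compile(r"[a-z0-9']+")            # lowercase word-like tokens

def preprocess(tweet):
    """Lowercase a tweet, strip links and usernames, and drop stopwords."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)
    text = USER_RE.sub(" ", text)
    tokens = TOKEN_RE.findall(text)
    return [t for t in tokens if t not in STOPWORDS]

def bigrams(tokens):
    """Return adjacent token pairs as tuples."""
    return list(zip(tokens, tokens[1:]))

tokens = preprocess("@alice The new phone is great! https://t.co/abc")
print(tokens)
print(bigrams(tokens))
```

Links are removed before usernames so that an `@` inside a URL is not treated as a mention.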
This document reviews research on predicting personality from Twitter users' tweets using machine learning algorithms. It discusses how tweets have attracted research interest from diverse fields. Different techniques have been used to predict personality from tweets, but there are still shortcomings to address. The aim is to consider the current state of this research area and explore personality prediction from tweets by reviewing past literature and discussing approaches to issues researchers face. It provides an overview of machine learning methods used for personality prediction from tweets, including data collection, preprocessing, model training and evaluation.
A scalable, lexicon based technique for sentiment analysisijfcstjournal
The rapid increase in the volume of sentiment-rich social media on the web has resulted in increased
interest among researchers in sentiment analysis and opinion mining. However, with so much
social media available on the web, sentiment analysis is now considered a big data task. Hence
conventional sentiment analysis approaches fail to efficiently handle the vast amount of sentiment data
available nowadays. The main focus of the research was to find a technique that can efficiently
perform sentiment analysis on big data sets, that is, a technique that can categorize text as positive, negative
or neutral in a fast and accurate manner. In the research, sentiment analysis was performed on a large
data set of tweets using Hadoop, and the performance of the technique was measured in terms of speed and
accuracy. The experimental results show that the technique exhibits very good efficiency in handling big
sentiment data sets.
The impact of sentiment analysis from user on Facebook to enhanced the servic...IJECEIAES
The document summarizes a study that analyzed sentiment from 600 user comments on Facebook to understand how user sentiment impacts Facebook's service quality. The comments were collected from three Facebook posts and analyzed using sentiment analysis tools. The results found 41.50% of comments were negative, 22.83% were neutral, and 35.67% were positive. The study aims to help Facebook understand user sentiment and perceptions to improve their services.
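As a quick arithmetic check, the reported percentages can be converted back into comment counts. The counts below are implied by the percentages and the total of 600; they are not stated in the summary itself, so treat them as a derived assumption.

```python
total_comments = 600
shares = {"negative": 41.50, "neutral": 22.83, "positive": 35.67}

# Convert each percentage into the nearest whole number of comments.
counts = {label: round(total_comments * pct / 100) for label, pct in shares.items()}
print(counts)
print(sum(counts.values()))

# Recompute the percentages from the whole counts to confirm the rounding.
for label, n in counts.items():
    print(label, round(100 * n / total_comments, 2))
```

The three counts add up to 600 and reproduce the reported percentages to two decimal places, so the figures are internally consistent.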
Insights to Problems, Research Trend and Progress in Techniques of Sentiment ...IJECEIAES
Research-based implementations of sentiment analysis are about a decade old and have introduced many significant algorithms, techniques, and frameworks for enhancing its performance. The applicability of sentiment analysis to business and political surveys is immense. However, we strongly feel that existing research progress in sentiment analysis is not on par with the demands of massively increasing dynamic data in pervasive environments. The problems associated with opinion mining over such forms of data have been less addressed, which leaves considerable scope for research. This paper briefly reviews existing research trends and some important recent research implementations, and explores some major open issues in sentiment analysis. We believe that this manuscript gives a progress report, with a snapshot of the effectiveness of current research techniques for sentiment analysis, that will help upcoming researchers identify research gaps and direct their work accordingly.
Abstract
Background: Indonesia ranks among the largest active Twitter user bases in the world.
Tweets written by Twitter users vary, from tweets containing positive responses to negative ones.
These responses can be utilized by the parties concerned for evaluation.
Objective: Public comments contain emoticons and sarcasm, which have an influence on
the process of sentiment analysis. Emoticons are considered to make it easier for people to
express their feelings, but quite a few researchers take the opposite view and ignore
emoticons on the grounds that they can interfere with the sentiment analysis process, while
sarcasm is considered to affect the results of sentiment analysis.
Methods: The emoticon and no-emoticon categories are tested with the same testing data
using two classification methods, the Naïve Bayes Classifier and Support Vector Machine. Sarcasm
data are classified using the Random Forest Classifier, Naïve Bayes Classifier and
Support Vector Machine methods.
Results: The use of emoticons with sarcasm detection can increase the accuracy value in the
sentiment analysis process using the Naïve Bayes Classifier method.
Conclusion: Based on the results, the amount of data greatly affects the value of accuracy.
The use of emoticons is beneficial in the sentiment analysis process. Sarcasm detection is
superior only with the Naïve Bayes Classifier method, owing to the difference in the amounts
of sarcastic and non-sarcastic data in the research process
A sentiment analysis model of agritech startup on Facebook comments using na...IJECEIAES
A Facebook page is a tool that can generate perceptions and acceptance, and support people and investors in making business decisions. Moreover, a Facebook page plays a part in engaging people in the form of a community. People share experiences and opinions toward products, services, and trends in particular periods in the Facebook page community. Sentiment analysis on Facebook pages has mostly covered education and other general topics in English. However, sentiment analysis regarding agritech startup topics in the Thai language has not been done yet. This study analyzes opinions and categorizes positive and negative comments by using a naive Bayes classifier to examine the sentiments and attitudes of people and investors. The results could possibly reflect the perception rate of agritech startups in Thailand and could be applied to explain attentiveness and assess people’s engagement opinions. Furthermore, it could be applied in studying consumer behavior, marketing analysis, spread of information, and attitudes. The study's model is generic and could be applied in other contexts to provide insightful suggestions.
IRJET- Interpreting Public Sentiments Variation by using FB-LDA TechniqueIRJET Journal
This document discusses sentiment analysis techniques for classifying tweets based on their positive, negative, or neutral sentiment. It proposes two Latent Dirichlet Allocation (LDA) based models - Foreground and Background LDA (FB-LDA) and Reason Candidate and Background LDA (RCB-LDA) - to analyze sentiment variation in tweets. FB-LDA can filter background topics and extract foreground topics to identify possible explanations for sentiment changes. RCB-LDA can rank reason candidates expressed in tweets to provide sentence-level sentiment explanations. The proposed techniques are intended to classify tweets and evaluate public sentiment variations by extracting possible reasons for those variations.
This document provides guidance on evaluating community engagement. It discusses:
- The importance of determining evaluation readiness and capacity before beginning an evaluation.
- Choosing the appropriate type of evaluation (e.g. process, outcome, or impact evaluation) based on the goals and scale of the engagement work.
- A four-step process for planning an evaluation: 1) developing a logic model, 2) seeking focus, 3) identifying measures, and 4) identifying data needs.
- Common data collection methods like surveys, interviews, and focus groups, as well as tips for designing surveys and analyzing both quantitative and qualitative data.
The document aims to offer practical, hands-on guidance for community groups
This document discusses using sentiment analysis on social media data to extract useful information for businesses and customers. It proposes a methodology that uses three modules: an extractor to access social media APIs and obtain raw data, a preprocessor to clean the raw data, and an analyzer using naive Bayes classification to categorize the preprocessed data into positive, negative, or neutral sentiments. The categorized sentiment data can then be used by businesses for decision making and by customers to inform their purchasing decisions. The methodology is demonstrated by implementing sentiment analysis on tweets from Twitter.
Improving Sentiment Analysis of Short Informal Indonesian Product Reviews usi...TELKOMNIKA JOURNAL
This document summarizes a research paper that proposed improving sentiment analysis of short informal Indonesian product reviews using synonym-based feature expansion. The paper developed an automatic sentiment analysis system using Naive Bayes classification and feature expansion. It first preprocessed texts through normalization, then used an API to find synonyms and expand text features. Experiments showed the proposed method improved sentiment analysis accuracy of short reviews to 98%, and that feature expansion helped more with small training datasets. The best performance was with 400 training examples using expansion.
IRJET- Predicting Academic Performance based on Social ActivitiesIRJET Journal
This document discusses predicting student academic performance based on their social media activities in an online learning environment. It presents a study of 343 students in a computer science course that used social tools like wikis, blogs, and microblogging for collaboration. The study collected data on student activities and used regression algorithms, including a novel Large Margin Nearest Neighbor Regression approach, to predict student grades based on their social media usage. The models achieved good prediction accuracy, outperforming other common regression algorithms.
Hate Speech Recognition System through NLP and Deep LearningIRJET Journal
The document describes a proposed system for recognizing hate speech through natural language processing and deep learning techniques. It discusses how hate speech on social media platforms is a growing problem. The proposed system uses techniques like TF-IDF, entropy estimation, and a fuzzy artificial neural network for hate speech recognition. The system preprocesses text data by removing special symbols, applying stemming, and removing stop words. It then classifies text as hate speech or not hate speech using the natural language processing and deep learning models. The authors conducted experiments that showed the system achieved highly positive results in hate speech recognition performance.
This document summarizes a research paper that proposes using a logistic regression classifier trained with stochastic gradient descent to predict Twitter users' personalities from their tweets. It begins with an abstract of the paper and an introduction on personality prediction from social media. It then provides more detail on the anatomy of the research, including defining personality prediction from Twitter, its applications, and the general process of using machine learning for the task. Next, it reviews several previous studies on personality prediction from Twitter and social networks, noting their approaches, findings and limitations. It identifies remaining research gaps, such as the need for improved linguistic analysis of tweets and more robust/scalable predictive models. Finally, it proposes using a logistic regression classifier as the personality prediction model to address
IRJET- Sentimental Analysis of Twitter Data for Job OpportunitiesIRJET Journal
This document discusses sentiment analysis of tweets related to job opportunities. It begins with an introduction to sentiment analysis and its applications. It then discusses how Twitter is a rich source of data for sentiment analysis due to the large number of daily posts, but that analyzing sentiment in tweets is challenging due to their short length and use of abbreviations. The document then outlines the design and implementation of the sentiment analysis, which involves downloading tweets and sentiment dictionaries, cleaning the tweet data by removing stop words and tokenizing, comparing words to dictionaries to determine sentiment scores, and classifying tweets as positive, negative or neutral based on the scores.
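The dictionary comparison step outlined above can be sketched as follows. The two word sets are tiny hypothetical stand-ins for the downloaded sentiment dictionaries, and the rule of scoring +1 and -1 per matched word is an assumption, since the summary does not give the exact scoring formula.

```python
# Hypothetical miniature sentiment dictionaries.
POSITIVE = {"good", "great", "excited", "happy"}
NEGATIVE = {"bad", "rejected", "unemployed", "sad"}

def sentiment_score(tokens):
    """Add 1 for each positive word and subtract 1 for each negative word."""
    return sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)

def classify(tokens):
    """Label a tokenized tweet by the sign of its sentiment score."""
    score = sentiment_score(tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

for text in ("excited about the great opportunity",
             "rejected again sad",
             "applied for a job"):
    print(text, "->", classify(text.split()))
```

A tweet with no dictionary words, or with equal numbers of positive and negative words, falls into the neutral class under this rule.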
IRJET - Sentiment Analysis for Marketing and Product Review using a Hybrid Ap...IRJET Journal
This document presents a hybrid approach for sentiment analysis that combines a lexicon-based technique and a machine learning technique using recurrent neural networks. It aims to analyze sentiments expressed in tweets towards products and services more accurately. The proposed model first cleans tweets collected from Twitter APIs. It then classifies the tweets' sentiment using both a lexicon-based technique using TextBlob and an LSTM-RNN model. The hybrid approach provides not only classification of sentiment but also a score of sentiment strength. This combined approach seeks to gain deeper insights than single techniques alone.
This document discusses sentiment analysis of online product reviews. It begins with an abstract that outlines how sentiment analysis can be used to analyze online reviews, comments, and opinions shared on social media. It then provides background on how sentiment analysis works at different levels of granularity like the document, sentence, phrase and feature levels. The document also discusses related work where other researchers have used techniques like supervised learning methods and support vector machines to perform sentiment analysis on product reviews from sites like traveler review sites. It outlines how sentiment analysis can help companies understand customer opinions and feedback.
Similar to Sentiment Mining of Community Development Program Evaluation Based on Social Media (20)
Amazon products reviews classification based on machine learning, deep learni...TELKOMNIKA JOURNAL
In recent times, the trend of online shopping through e-commerce stores and websites has grown to a huge extent. Whenever a product is purchased on an e-commerce platform, people leave their reviews about the product. These reviews are very helpful for the store owners and the product’s manufacturers for the betterment of their work process as well as product quality. An automated system is proposed in this work that operates on two datasets D1 and D2 obtained from Amazon. After certain preprocessing steps, N-gram and word embedding-based features are extracted using term frequency-inverse document frequency (TF-IDF), bag of words (BoW) and global vectors (GloVe), and Word2vec, respectively. Four machine learning (ML) models (support vector machines (SVM), random forest (RF), logistic regression (LR), and multinomial Naïve Bayes (MNB)), two deep learning (DL) models (convolutional neural network (CNN) and long-short term memory (LSTM)), and standalone bidirectional encoder representations (BERT) are used to classify reviews as either positive or negative. The results obtained by the standard ML, DL models and BERT are evaluated using certain performance evaluation measures. BERT turns out to be the best-performing model in the case of D1 with an accuracy of 90% on features derived by word embedding models while the CNN provides the best accuracy of 97% upon word embedding features in the case of D2. The proposed model shows better overall performance on D2 as compared to D1.
Design, simulation, and analysis of microstrip patch antenna for wireless app...TELKOMNIKA JOURNAL
In this study, a microstrip patch antenna that works at 3.6 GHz was built and tested to see how well it works. In this work, Rogers RT/Duroid 5880 has been used as the substrate material, with a dielectric permittivity of 2.2 and a thickness of 0.3451 mm; it serves as the base for the examined antenna. The computer simulation technology (CST) studio suite is utilized to show the recommended antenna design. The goal of this study was to get a more extensive transmission capacity, a lower voltage standing wave ratio (VSWR), and a lower return loss, but the main goal was to get a higher gain, directivity, and efficiency. After simulation, the return loss, gain, directivity, bandwidth, and efficiency of the supplied antenna are found to be -17.626 dB, 9.671 dBi, 9.924 dBi, 0.2 GHz, and 97.45%, respectively. Besides, the recreation uncovered that the transfer speed side-lobe level at phi was much better than those of the earlier works, at -28.8 dB, respectively. Thus, it makes a solid contender for remote innovation and more robust communication.
Design and simulation an optimal enhanced PI controller for congestion avoida...TELKOMNIKA JOURNAL
This document describes using a snake optimization algorithm to tune the gains of an enhanced proportional-integral controller for congestion avoidance in a TCP/AQM system. The controller aims to maintain a stable and desired queue size without noise or transmission problems. A linearized model of the TCP/AQM system is presented. An enhanced PI controller combining nonlinear gain and original PI gains is proposed. The snake optimization algorithm is then used to tune the parameters of the enhanced PI controller to achieve optimal system performance and response. Simulation results are discussed showing the proposed controller provides a stable and robust behavior for congestion control.
Improving the detection of intrusion in vehicular ad-hoc networks with modifi...TELKOMNIKA JOURNAL
Vehicular ad-hoc networks (VANETs) are wireless-equipped vehicles that form networks along the road. The security of this network has been a major challenge. The identity-based cryptosystem (IBC) previously used to secure the networks suffers from membership authentication security features. This paper focuses on improving the detection of intruders in VANETs with a modified identity-based cryptosystem (MIBC). The MIBC is developed using a non-singular elliptic curve with Lagrange interpolation. The public key of vehicles and roadside units on the network are derived from number plates and location identification numbers, respectively. Pseudo-identities are used to mask the real identity of users to preserve their privacy. The membership authentication mechanism ensures that only valid and authenticated members of the network are allowed to join the network. The performance of the MIBC is evaluated using intrusion detection ratio (IDR) and computation time (CT) and then validated with the existing IBC. The result obtained shows that the MIBC recorded an IDR of 99.3% against 94.3% obtained for the existing identity-based cryptosystem (EIBC) for 140 unregistered vehicles attempting to intrude on the network. The MIBC shows lower CT values of 1.17 ms against 1.70 ms for EIBC. The MIBC can be used to improve the security of VANETs.
Conceptual model of internet banking adoption with perceived risk and trust f...TELKOMNIKA JOURNAL
Understanding the primary factors of internet banking (IB) acceptance is critical for both banks and users; nevertheless, our knowledge of the role of users’ perceived risk and trust in IB adoption is limited. As a result, we develop a conceptual model by incorporating perceived risk and trust into the technology acceptance model (TAM) theory toward the IB. The proper research emphasized that the most essential component in explaining IB adoption behavior is behavioral intention to use IB adoption. TAM is helpful for figuring out how elements that affect IB adoption are connected to one another. According to previous literature on IB and the use of such technology in Iraq, one has to choose a theoretical foundation that may justify the acceptance of IB from the customer’s perspective. The conceptual model was therefore constructed using the TAM as a foundation. Furthermore, perceived risk and trust were added to the TAM dimensions as external factors. The key objective of this work was to extend the TAM to construct a conceptual model for IB adoption and to get sufficient theoretical support from the existing literature for the essential elements and their relationships in order to unearth new insights about factors responsible for IB adoption.
Efficient combined fuzzy logic and LMS algorithm for smart antennaTELKOMNIKA JOURNAL
The smart antennas are broadly used in wireless communication. The least mean square (LMS) algorithm is a procedure that is concerned in controlling the smart antenna pattern to accommodate specified requirements such as steering the beam toward the desired signal, in addition to placing the deep nulls in the direction of unwanted signals. The conventional LMS (C-LMS) has some drawbacks like slow convergence speed besides high steady state fluctuation error. To overcome these shortcomings, the present paper adopts an adaptive fuzzy control step size least mean square (FC-LMS) algorithm to adjust its step size. Computer simulation outcomes illustrate that the given model has fast convergence rate as well as low mean square error steady state.
Design and implementation of a LoRa-based system for warning of forest fireTELKOMNIKA JOURNAL
This paper presents the design and implementation of a forest fire monitoring and warning system based on long range (LoRa) technology, a novel ultra-low power consumption and long-range wireless communication technology for remote sensing applications. The proposed system includes a wireless sensor network that records environmental parameters such as temperature, humidity, wind speed, and carbon dioxide (CO2) concentration in the air, as well as taking infrared photos. The data collected at each sensor node will be transmitted to the gateway via LoRa wireless transmission. Data will be collected, processed, and uploaded to a cloud database at the gateway. An Android smartphone application that allows anyone to easily view the recorded data has been developed. When a fire is detected, the system will sound a siren and send a warning message to the responsible personnel, instructing them to take appropriate action. Experiments in Tram Chim Park, Vietnam, have been conducted to verify and evaluate the operation of the system.
Wavelet-based sensing technique in cognitive radio networkTELKOMNIKA JOURNAL
Cognitive radio is a smart radio that can change its transmitter parameter based on interaction with the environment in which it operates. The demand for frequency spectrum is growing due to a big data issue as many Internet of Things (IoT) devices are in the network. Based on previous research, most frequency spectrum was used, but some spectrums were not used, called spectrum hole. Energy detection is one of the spectrum sensing methods that has been frequently used since it is easy to use and does not require license users to have any prior signal understanding. But this technique is incapable of detecting at low signal-to-noise ratio (SNR) levels. Therefore, the wavelet-based sensing is proposed to overcome this issue and detect spectrum holes. The main objective of this work is to evaluate the performance of wavelet-based sensing and compare it with the energy detection technique. The findings show that the percentage of detection in wavelet-based sensing is 83% higher than energy detection performance. This result indicates that the wavelet-based sensing has higher precision in detection and the interference towards primary user can be decreased.
A novel compact dual-band bandstop filter with enhanced rejection bandsTELKOMNIKA JOURNAL
In this paper, we present the design of a new wide dual-band bandstop filter (DBBSF) using nonuniform transmission lines. The method used to design this filter is to replace conventional uniform transmission lines with nonuniform lines governed by a truncated Fourier series. Based on how impedances are profiled in the proposed DBBSF structure, the fractional bandwidths of the two 10 dB-down rejection bands are widened to 39.72% and 52.63%, respectively, and the physical size has been reduced compared to that of the filter with the uniform transmission lines. The results of the electromagnetic (EM) simulation support the obtained analytical response and show an improved frequency behavior.
Deep learning approach to DDoS attack with imbalanced data at the application...TELKOMNIKA JOURNAL
A distributed denial of service (DDoS) attack is where one or more computers attack or target a server computer, by flooding internet traffic to the server. As a result, the server cannot be accessed by legitimate users. A result of this attack causes enormous losses for a company because it can reduce the level of user trust, and reduce the company’s reputation to lose customers due to downtime. One of the services at the application layer that can be accessed by users is a web-based lightweight directory access protocol (LDAP) service that can provide safe and easy services to access directory applications. We used a deep learning approach to detect DDoS attacks on the CICDDoS 2019 dataset on a complex computer network at the application layer to get fast and accurate results for dealing with unbalanced data. Based on the results obtained, it is observed that DDoS attack detection using a deep learning approach on imbalanced data performs better when implemented using synthetic minority oversampling technique (SMOTE) method for binary classes. On the other hand, the proposed deep learning approach performs better for detecting DDoS attacks in multiclass when implemented using the adaptive synthetic (ADASYN) method.
The appearance of uncertainties and disturbances often affects the characteristics of either linear or nonlinear systems. In addition, the stabilization process may deteriorate, with a catastrophic effect on system performance. As such, this manuscript addresses the concept of the matching condition for systems that suffer from mismatched uncertainties and exogenous disturbances. The perturbation towards the system at hand is assumed to be known and unbounded. To reach this outcome, uncertainties and their classifications are reviewed thoroughly. The structural matching condition is proposed and tabulated in Proposition 1. Two types of mathematical expressions are presented to distinguish the system with matched uncertainty from the system with mismatched uncertainty. Lastly, two-dimensional numerical expressions are provided to practice the proposed proposition. The outcome shows that the matching condition has the ability to change the system to a design-friendly model for asymptotic stabilization.
Implementation of FinFET technology based low power 4×4 Wallace tree multipli...TELKOMNIKA JOURNAL
Many systems, including digital signal processors, finite impulse response (FIR) filters, application-specific integrated circuits, and microprocessors, use multipliers. The demand for low power multipliers is gradually rising day by day in the current technological trend. In this study, we describe a 4×4 Wallace multiplier based on a carry select adder (CSA) that uses less power and has a better power delay product than existing multipliers. HSPICE tool at 16 nm technology is used to simulate the results. In comparison to the traditional CSA-based multiplier, which has a power consumption of 1.7 µW and power delay product (PDP) of 57.3 fJ, the results demonstrate that the Wallace multiplier design employing CSA with first zero finding logic (FZF) logic has the lowest power consumption of 1.4 µW and PDP of 27.5 fJ.
Evaluation of the weighted-overlap add model with massive MIMO in a 5G systemTELKOMNIKA JOURNAL
The flaw in 5G orthogonal frequency division multiplexing (OFDM) becomes apparent in high-speed situations. Because the Doppler effect causes frequency shifts, the orthogonality of OFDM subcarriers is broken, degrading both bit error rate (BER) and throughput. As part of this research, we use a novel design that combines massive multiple input multiple output (MIMO) and weighted overlap and add (WOLA) to improve the performance of 5G systems. To determine which design is superior, throughput and BER are calculated for both the proposed design and OFDM. The results of the improved system show a massive improvement in performance over the conventional system and significant improvements with massive MIMO, including the best throughput and BER. When compared to conventional systems, the improved system has a throughput that is around 22% higher and the best performance in terms of BER, with around 25% less error than OFDM.
Reflector antenna design in different frequencies using frequency selective s...TELKOMNIKA JOURNAL
This study aims to obtain, from a single antenna at two different frequencies, the two different asymmetric radiation patterns normally obtained from antennas in the shape of the cross-section of a parabolic reflector (fan blade type antennas) and from antennas with cosecant-square radiation characteristics. For this purpose, a fan blade type antenna is designed first, and the reflective surface of this antenna is then completed to the shape of the reflective surface of an antenna with the cosecant-square radiation characteristic, using a frequency selective surface designed to provide characteristics suitable for the purpose. The designed frequency selective surface provides transmission as close to perfect as possible at the 4 GHz operating frequency, while at the 5 GHz operating frequency it acts as a band-stop filter for electromagnetic waves and becomes a reflective surface. With this frequency selective surface used as a reflective surface in the antenna, a fan blade type radiation characteristic is obtained at the 4 GHz operating frequency, while a cosecant-square radiation characteristic is obtained at the 5 GHz operating frequency.
Reagentless iron detection in water based on unclad fiber optical sensorTELKOMNIKA JOURNAL
A simple and low-cost fiber-based optical sensor for iron detection is demonstrated in this paper. The sensor head consists of an unclad optical fiber with an unclad length of 1 cm and a straight structure. The results obtained show a linear relationship between the output light intensity and iron concentration, illustrating the functionality of this iron optical sensor. Based on the experimental results, a sensitivity of 0.0328/ppm and a linearity of 0.9824 are achieved at the wavelength of 690 nm. At the same wavelength, other performance parameters are also studied. The resolution and limit of detection (LOD) are found to be 0.3049 ppm and 0.0755 ppm, respectively. This iron sensor is advantageous in that it does not require any reagent for detection, making iron sensing simpler and more cost-effective to implement.
Impact of CuS counter electrode calcination temperature on quantum dot sensit...TELKOMNIKA JOURNAL
In place of the commercial Pt electrode used in quantum sensitized solar cells, the low-cost CuS cathode is created using electrophoresis. High resolution scanning electron microscopy and X-ray diffraction were used to analyze the structure and morphology of structural cubic samples with diameters ranging from 40 nm to 200 nm. The conversion efficiency of solar cells is significantly impacted by the calcination temperatures of cathodes at 100 °C, 120 °C, 150 °C, and 180 °C under vacuum. The fluorine doped tin oxide (FTO)/CuS cathode electrode reached a maximum efficiency of 3.89% when it was calcined at 120 °C. Compared to other temperature combinations, CuS nanoparticles crystallize at 120 °C, which lowers resistance while increasing electron lifetime.
A progressive learning for structural tolerance online sequential extreme lea...TELKOMNIKA JOURNAL
This article discusses the progressive learning for structural tolerance online sequential extreme learning machine (PSTOS-ELM). PSTOS-ELM can maintain robust accuracy while updating with new data and new class data in an online training setting. The robustness arises from the use of Householder block exact QR decomposition recursive least squares (HBQRD-RLS) in PSTOS-ELM. This method is suitable for applications with streaming data in which new classes often appear. Our experiment compares the accuracy and accuracy robustness of PSTOS-ELM during data updates with the batch extreme learning machine (ELM) and the structural tolerance online sequential extreme learning machine (STOS-ELM), both of which must retrain on the data when a new class appears. The experimental results show that PSTOS-ELM has accuracy and robustness comparable to ELM and STOS-ELM while also being able to update new class data immediately.
Electroencephalography-based brain-computer interface using neural networksTELKOMNIKA JOURNAL
This study aimed to develop a brain-computer interface that can control an electric wheelchair using electroencephalography (EEG) signals. First, we used the Mind Wave Mobile 2 device to capture raw EEG signals from the surface of the scalp. The signals were transformed into the frequency domain using fast Fourier transform (FFT) and filtered to monitor changes in attention and relaxation. Next, we performed time and frequency domain analyses to identify features for five eye gestures: opened, closed, blink per second, double blink, and lookup. The base state was the opened-eyes gesture, and we compared the features of the remaining four action gestures to the base state to identify potential gestures. We then built a multilayer neural network to classify these features into five signals that control the wheelchair’s movement. Finally, we designed an experimental wheelchair system to test the effectiveness of the proposed approach. The results demonstrate that the EEG classification was highly accurate and computationally efficient. Moreover, the average performance of the brain-controlled wheelchair system was over 75% across different individuals, which suggests the feasibility of this approach.
Adaptive segmentation algorithm based on level set model in medical imagingTELKOMNIKA JOURNAL
For image segmentation, level set models are frequently employed. They offer the best solution for overcoming the main limitations of deformable parametric models. However, the challenge when applying those models to medical images still lies in removing blur at image edges, which directly affects the edge indicator function, prevents adaptive segmentation of images, and causes a wrong analysis of pathologies, which in turn prevents a correct diagnosis. To overcome such issues, an effective process is suggested that simultaneously models and solves a system of two-dimensional partial differential equations (PDE). The first PDE performs restoration using Euler’s equation, similar to an anisotropic smoothing based on a regularized Perona and Malik filter that eliminates noise while preserving edge information in accordance with the contours detected by the second equation, which segments the image based on the solutions of the first equation. This approach allows a new algorithm to be developed that overcomes the drawbacks of the studied model. Results of the proposed method give clear segments that can be applied to any application. Experiments on many medical images, in particular blurry images with high information loss, demonstrate that the developed approach produces superior segmentation results in terms of quantity and quality compared to other models presented in previous works.
Automatic channel selection using shuffled frog leaping algorithm for EEG bas...TELKOMNIKA JOURNAL
Drug addiction is a complex neurobiological disorder that necessitates comprehensive treatment of both the body and mind. It is categorized as a brain disorder due to its impact on the brain. Various methods such as electroencephalography (EEG), functional magnetic resonance imaging (FMRI), and magnetoencephalography (MEG) can capture brain activities and structures. EEG signals provide valuable insights into neurological disorders, including drug addiction. Accurate classification of drug addiction from EEG signals relies on appropriate features and channel selection. Choosing the right EEG channels is essential to reduce computational costs and mitigate the risk of overfitting associated with using all available channels. To address the challenge of optimal channel selection in addiction detection from EEG signals, this work employs the shuffled frog leaping algorithm (SFLA). SFLA facilitates the selection of appropriate channels, leading to improved accuracy. Wavelet features extracted from the selected input channel signals are then analyzed using various machine learning classifiers to detect addiction. Experimental results indicate that after selecting features from the appropriate channels, classification accuracy significantly increased across all classifiers. Particularly, the multi-layer perceptron (MLP) classifier combined with SFLA demonstrated a remarkable accuracy improvement of 15.78% while reducing time complexity.
Batteries -Introduction – Types of Batteries – discharging and charging of battery - characteristics of battery –battery rating- various tests on battery- – Primary battery: silver button cell- Secondary battery :Ni-Cd battery-modern battery: lithium ion battery-maintenance of batteries-choices of batteries for electric vehicle applications.
Fuel Cells: Introduction- importance and classification of fuel cells - description, principle, components, applications of fuel cells: H2-O2 fuel cell, alkaline fuel cell, molten carbonate fuel cell and direct methanol fuel cells.
International Conference on NLP, Artificial Intelligence, Machine Learning an...gerogepatton
International Conference on NLP, Artificial Intelligence, Machine Learning and Applications (NLAIM 2024) offers a premier global platform for exchanging insights and findings in the theory, methodology, and applications of NLP, Artificial Intelligence, Machine Learning, and their applications. The conference seeks substantial contributions across all key domains of NLP, Artificial Intelligence, Machine Learning, and their practical applications, aiming to foster both theoretical advancements and real-world implementations. With a focus on facilitating collaboration between researchers and practitioners from academia and industry, the conference serves as a nexus for sharing the latest developments in the field.
Embedded machine learning-based road conditions and driving behavior monitoringIJECEIAES
Car accident rates have increased in recent years, resulting in losses in human lives, properties, and other financial costs. An embedded machine learning-based system is developed to address this critical issue. The system can monitor road conditions, detect driving patterns, and identify aggressive driving behaviors. The system is based on neural networks trained on a comprehensive dataset of driving events, driving styles, and road conditions. The system effectively detects potential risks and helps mitigate the frequency and impact of accidents. The primary goal is to ensure the safety of drivers and vehicles. Collecting data involved gathering information on three key road events: normal street and normal drive, speed bumps, circular yellow speed bumps, and three aggressive driving actions: sudden start, sudden stop, and sudden entry. The gathered data is processed and analyzed using a machine learning system designed for limited power and memory devices. The developed system resulted in 91.9% accuracy, 93.6% precision, and 92% recall. The achieved inference time on an Arduino Nano 33 BLE Sense with a 32-bit CPU running at 64 MHz is 34 ms and requires 2.6 kB peak RAM and 139.9 kB program flash memory, making it suitable for resource-constrained embedded systems.
Introduction- e - waste – definition - sources of e-waste– hazardous substances in e-waste - effects of e-waste on environment and human health- need for e-waste management– e-waste handling rules - waste minimization techniques for managing e-waste – recycling of e-waste - disposal treatment methods of e- waste – mechanism of extraction of precious metal from leaching solution-global Scenario of E-waste – E-waste in India- case studies.
Null Bangalore | Pentesters Approach to AWS IAMDivyanshu
#Abstract:
- Learn more about the real-world methods for auditing AWS IAM (Identity and Access Management) as a pentester. So let us proceed with a brief discussion of IAM as well as some typical misconfigurations and their potential exploits in order to reinforce the understanding of IAM security best practices.
- Gain actionable insights into AWS IAM policies and roles, using hands on approach.
#Prerequisites:
- Basic understanding of AWS services and architecture
- Familiarity with cloud security concepts
- Experience using the AWS Management Console or AWS CLI.
- For hands on lab create account on [killercoda.com](https://killercoda.com/cloudsecurity-scenario/)
# Scenario Covered:
- Basics of IAM in AWS
- Implementing IAM Policies with Least Privilege to Manage S3 Bucket
- Objective: Create an S3 bucket with least privilege IAM policy and validate access.
- Steps:
- Create S3 bucket.
- Attach least privilege policy to IAM user.
- Validate access.
- Exploiting IAM PassRole Misconfiguration
- Allows a user to pass a specific IAM role to an AWS service (ec2), typically used for service access delegation. Then exploit PassRole Misconfiguration granting unauthorized access to sensitive resources.
- Objective: Demonstrate how a PassRole misconfiguration can grant unauthorized access.
- Steps:
- Allow user to pass IAM role to EC2.
- Exploit misconfiguration for unauthorized access.
- Access sensitive resources.
- Exploiting IAM AssumeRole Misconfiguration with Overly Permissive Role
- An overly permissive IAM role configuration can lead to privilege escalation by creating a role with administrative privileges and allow a user to assume this role.
- Objective: Show how overly permissive IAM roles can lead to privilege escalation.
- Steps:
- Create role with administrative privileges.
- Allow user to assume the role.
- Perform administrative actions.
- Differentiation between PassRole vs AssumeRole
Try at [killercoda.com](https://killercoda.com/cloudsecurity-scenario/)
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Sinan KOZAK
Sinan from the Delivery Hero mobile infrastructure engineering team shares a deep dive into performance acceleration with Gradle build cache optimizations. Sinan shares their journey into solving complex build-cache problems that affect Gradle builds. By understanding the challenges and solutions found in our journey, we aim to demonstrate the possibilities for faster builds. The case study reveals how overlapping outputs and cache misconfigurations led to significant increases in build times, especially as the project scaled up with numerous modules using Paparazzi tests. The journey from diagnosing to defeating cache issues offers invaluable lessons on maintaining cache integrity without sacrificing functionality.
The CBC machine is a common diagnostic tool used by doctors to measure a patient's red blood cell count, white blood cell count and platelet count. The machine uses a small sample of the patient's blood, which is then placed into special tubes and analyzed. The results of the analysis are then displayed on a screen for the doctor to review. The CBC machine is an important tool for diagnosing various conditions, such as anemia, infection and leukemia. It can also help to monitor a patient's response to treatment.
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...IJECEIAES
Medical image analysis has witnessed significant advancements with deep learning techniques. In the domain of brain tumor segmentation, the ability to
precisely delineate tumor boundaries from magnetic resonance imaging (MRI)
scans holds profound implications for diagnosis. This study presents an ensemble convolutional neural network (CNN) with transfer learning, integrating
the state-of-the-art Deeplabv3+ architecture with the ResNet18 backbone. The
model is rigorously trained and evaluated, exhibiting remarkable performance
metrics, including an impressive global accuracy of 99.286%, a high-class accuracy of 82.191%, a mean intersection over union (IoU) of 79.900%, a weighted
IoU of 98.620%, and a Boundary F1 (BF) score of 83.303%. Notably, a detailed comparative analysis with existing methods showcases the superiority of
our proposed model. These findings underscore the model’s competence in precise brain tumor localization, underscoring its potential to revolutionize medical
image analysis and enhance healthcare outcomes. This research paves the way
for future exploration and optimization of advanced CNN models in medical
imaging, emphasizing addressing false positives and resource efficiency.
artificial intelligence and data science contents.pptxGauravCar
What is artificial intelligence? Artificial intelligence is the ability of a computer or computer-controlled robot to perform tasks that are commonly associated with the intellectual processes characteristic of humans, such as the ability to reason.
2. TELKOMNIKA ISSN: 1693-6930
Sentiment Mining of Community Development Program Evaluation Based on … (Siti Yuliyanti)
1859
member perspective, and sway public sentiments and emotions, which has profoundly affected the
overall social impact of program development. Sentiment analysis, the computational study of
people's opinions, sentiments, emotions, and attitudes, therefore fits well with the
transformation of ideas during community development processes. This problem is increasingly
important in building society's capacity. It offers numerous research challenges but
promises insight useful to anyone interested in opinion analysis and social media analysis [7].
For the purpose of evaluation, a collection of tweets was assembled. These tweets were
extracted and processed for information such as the sentiment expressed in tweets dedicated to
community development issues. Sentiment words can be divided into two types: the base type and
the comparative type. All the preceding example words are of the base type. Sentiment words of
the comparative type (which includes the superlative type) are used to express comparative and
superlative opinions. Examples include better, worse, best, worst, and the like, which are
comparative and superlative forms of base adjectives or adverbs such as good and bad.
Sentiment analysis of tweets is used to find out whether a tweet carries positive, negative, or
neutral sentiment. Two kinds of learning are usually used in sentiment analysis: supervised
learning and unsupervised learning.
Several researchers have studied how to adapt a general sentiment lexicon to a
particular domain. The machine learning method belongs to supervised learning; it usually
needs a large amount of training data that has been labeled manually. Without labeled training
data, supervised learning cannot proceed [7]. The lexicon-based method belongs to
unsupervised learning; it does not need training data and depends only on the dictionary that
is used. The two methods have different characteristics, but they can complement each other
when combined. Two major techniques are commonly used in sentiment analysis: machine
learning based and lexicon based [8]. Supervised machine learning techniques commonly used
for this purpose include support vector machine (SVM) [9] and Naïve Bayes [10].
The two methods can be combined by using the lexicon-based method to label tweets, which
then serve as training data for the SVM method, so that no manual labeling is needed
[11] [12]. SVM has been shown to give good classification results in sentiment mining, but
SVM as implemented in practice often falls short of the theoretically expected level, because
implementations rely on approximate algorithms owing to the high time and space complexity.
To improve the limited classification performance of a practical SVM, PCA is deployed to
decrease the complexity of the SVM-based sentiment classification task by reducing the data
dimensionality.
In contrast to previous research, which mostly investigated opinion corpora written in
English, this work considers a community development opinion problem written in Bahasa
Indonesia, a language whose structure differs from that of English. We build a model to
determine the sentiment of tweets about public responses to the activities of the community
development program. The tweets are preprocessed, reduced in features, and classified to find
out whether each tweet carries positive, negative, or neutral sentiment. This research focuses
on the activities of community development programs in the area of Bogor. First, a dataset
about community development activities is crawled and preprocessed with the following steps:
filtering, lower-casing, stopword removal, tokenizing, parsing, sentiment labeling, and term
weighting. Second, the features with the lowest principal component values are removed using
PCA to facilitate classification using SVM. Finally, the models are evaluated using test
scenarios that compare different proportions of training data and test data.
2. Research Method
The dataset of this case study was written in Bahasa Indonesia and relates to the activities
of the community development program that were previously observed directly at the office of
the Bogor municipal government agency, West Java, Indonesia. We collected more than 2000
tweets from Twitter about two prominent youth awareness activities. As shown in Figure 1, the
research framework is divided into four parts: data collection, preprocessing, sentiment
classification, and evaluation. The details and results are discussed in the following sections.
3. ISSN: 1693-6930
TELKOMNIKA Vol. 15, No. 4, December 2017 : 1858 - 1864
1860
2.1. Preprocessing
Preprocessing of the dataset applied the following sub-processes: filtering, lower-casing,
tokenizing, stopword removal, weighting, and class labeling. Once these stages had produced
cleaned data, the words were weighted by calculating the term frequency-inverse document
frequency (TF-IDF) and the sentiment was labeled using the lexicon-based method. The
lexicon-based method belongs to the unsupervised techniques; it classifies the data
into two classes, positive or negative [8], [10]. This lexicon-based method adapts the
word-level polarities of a general-purpose sentiment lexicon for a particular domain by
utilizing the expression-level polarities in the domain. In return, the adapted word-level
polarities are used to improve the expression-level polarities. The word-level and
expression-level polarity relationships are modeled as a set of constraints and the problem is
solved using integer linear programming. The method relies on a dictionary to classify each
tweet as having positive, negative, or neutral sentiment. The lexicon-based method used in
this research has several steps: determining the polarity of words, handling negation, and
giving a score to each entity in the tweet. The score for an entity is calculated with
formula (1), based on [7].
$$\mathrm{score}(f) = \sum_{\omega_i :\; \omega_i \in S \,\wedge\, \omega_i \in V} \frac{\omega_i.so}{\mathrm{distance}(\omega_i, f)} \qquad (1)$$
where:
score(f) = the final label score of the feature f
ωi = a sentiment word
S = all sentiment words
V = sample space of features and sentiment words
so = label of the sentiment word (+1, 0, or -1)
distance(ωi, f) = distance between the feature (f) and the sentiment word (ωi)
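As a rough illustration of formula (1), the sketch below sums the label of every sentiment word in a tweet divided by its distance from the feature. Several things here are assumptions and not taken from the paper: distance is measured in token positions, the helper names `score_feature` and `label_from_score` are hypothetical, and the two-word lexicon is a toy example ("bagus" = good, "buruk" = bad). The sign-based labelling follows the positive/negative/neutral rule the paper attributes to [7].

```python
def score_feature(tokens, feature_index, lexicon):
    """Sum so / distance over the sentiment words found in one tweet."""
    total = 0.0
    for position, token in enumerate(tokens):
        so = lexicon.get(token, 0)  # +1, 0 or -1; unknown words count as 0
        distance = abs(position - feature_index)  # assumed: distance in tokens
        if so == 0 or distance == 0:
            continue  # skip non-sentiment words and the feature itself
        total += so / distance
    return total


def label_from_score(score):
    """Map the end score to a class: positive, negative, otherwise neutral."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical toy lexicon and tokenized tweet (not from the paper's corpus).
lexicon = {"bagus": 1, "buruk": -1}
tokens = ["kegiatan", "pemuda", "sangat", "bagus"]

score = score_feature(tokens, feature_index=0, lexicon=lexicon)
print(score, label_from_score(score))
```

Here the feature is the first token and the only sentiment word sits three tokens away, so the score is 1/3 and the label is positive. Negation handling, which the paper lists as a separate step, is deliberately left out of this sketch.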
[Figure 1 (flowchart; text recovered from the image): tweets are crawled from Twitter using an authentication key and saved to file; preprocessing (filter, case folding, tokenizing and parsing, stopword removal); lexicon-based class checking against the negative and positive corpus; term weighting (TF) into a tweet matrix; feature reduction using PCA (variance = 80%, highest PCs kept); SVM classification with estimation of the parameters (C, γ); evaluation with Model 1 (60% training, 40% testing), Model 2 (70%/30%), Model 3 (80%/20%), and Model 4 (90%/10%); outputs are the sentiment percentages and the evaluation results per model and activity.]
Figure 1. Research Framework
In general, term frequency (TF) counts the number of times a word occurs in a tweet. It shows
how important a word is to a tweet within a collection of tweets [17]. The results of phases D
and E are used as a vector W, where W = {w1, w2, …, wi} with i ∈ S contains the candidate
sentiment words, and W ∈ V, where V is a corpus that contains features and sentiment words.
This step gives each tweet a class label with the lexicon-based method, using the positive
and negative classes that exist in the Indonesian lexicon corpus. The proximity value is then
calculated against the lexicon corpus using Equation 1. If the end score is positive, the
feature is assumed to be positive in the tweet; if the end score is negative, the feature is
assumed to be negative; and if neither holds, the tweet falls into the neutral class [7].
TF-IDF [11] is used to identify how important each term in the corpus is. It is also a common
technique for calculating vector weights based on semantic relatedness. The term frequency
tf_{t,d}, the frequency of term t in document d, is defined in formula (2). In this work a
document is a tweet and only the term frequency is used.
$$tf_{t,d} = \frac{f_{t,d}}{\operatorname{argmax}(tf_{d})} \qquad (2)$$
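The term-frequency weighting can be sketched as follows. This is only one plausible reading of formula (2): it assumes that the raw count of a term in a tweet is divided by the largest raw count of any term in the same tweet. The function name `term_frequencies` is hypothetical, and the paper's own computation (done in RapidMiner Studio) may differ.

```python
from collections import Counter


def term_frequencies(tokens):
    """Return each term's count divided by the largest count in the tweet."""
    counts = Counter(tokens)
    if not counts:
        return {}  # an empty tweet has no term weights
    max_count = max(counts.values())
    return {term: count / max_count for term, count in counts.items()}


# Hypothetical tokenized tweet: "bagus" occurs twice, "acara" once.
weights = term_frequencies(["bagus", "acara", "bagus"])
print(weights)
```

With this normalisation the most frequent term in a tweet always receives weight 1, which keeps weights comparable between short and long tweets.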
2.2. Feature Reduction
Dimension reduction is the main task of the PCA (Principal Component Analysis) algorithm
[15]. In this work, PCA is used as a statistical procedure in exploratory data analysis and
for making predictive opinion models. PCA can be computed by eigenvalue decomposition of a
data covariance (or correlation) matrix or by singular value decomposition of a data matrix,
usually after mean centering (and normalizing or using Z-scores) the data matrix for each
attribute. The results of a PCA are usually discussed in terms of component scores, sometimes
called factor scores (the transformed variable values corresponding to a particular data
point), and loadings (the weight by which each standardized original variable should be
multiplied to get the component score). These principal components are used as predictor or
criterion variables in other analyses. PCA orthogonalizes the variables; the principal
components with the largest variation are kept and the components with the least variation are
eliminated from the dataset. PCA is the simplest of the true eigenvector-based multivariate
analyses; its operation can be thought of as revealing the internal structure of the data in a
way that best explains the variance in the data. If a multivariate dataset is visualised as a
set of coordinates in a high-dimensional data space (one axis per variable), PCA can supply
the user with a lower-dimensional picture, a projection of this object when viewed from its
most informative viewpoint. Users might interpret the result by using only the first few
principal components so that the dimensionality of the transformed data is reduced.
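The idea of keeping only the first few principal components can be sketched with a cumulative explained-variance rule. The 80% threshold is taken from the "variance = 80%" note in Figure 1; everything else is an assumption for illustration: the eigenvalues of the covariance matrix are taken as already computed, and the function name `components_to_keep` is hypothetical.

```python
def components_to_keep(eigenvalues, threshold=0.80):
    """Smallest number of leading components whose variance share reaches threshold."""
    ordered = sorted(eigenvalues, reverse=True)  # largest variance first
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value / total  # share of variance explained so far
        if cumulative >= threshold:
            return k
    return len(ordered)


# Hypothetical eigenvalues: shares are 0.50, 0.25, 0.125, 0.125.
kept = components_to_keep([4.0, 2.0, 1.0, 1.0])
print(kept)
```

In this toy case two components explain 75% of the variance and three explain 87.5%, so three components are kept. The components beyond that point are the ones a PCA-based reduction would discard.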
2.3. Classification Methods
Supervised learning is the machine learning task of learning a function that maps an
input to an output based on example input-output pairs. It infers a function from labeled
training data consisting of a set of training examples. In supervised learning, each example
is a pair consisting of an input object (typically a vector) and a desired output value (also
called the supervisory signal). A supervised learning algorithm analyzes the training data and
produces an inferred function, which can be used for mapping new examples. An optimal scenario
will allow the algorithm to correctly determine the class labels for unseen instances. This
requires the learning algorithm to generalize from the training data to unseen situations in a
"reasonable" way. An SVM model is a representation of the examples as points in space, mapped
so that the examples of the separate categories are divided by a clear gap that is as wide as
possible. SVMs are supervised learning models with associated learning algorithms that analyze
data used for classification and regression analysis. Given a set of training examples, each
marked as belonging to one or the other of two categories, an SVM training algorithm builds a
model that assigns new examples to one category or the other, making it a non-probabilistic
binary linear classifier. Supervised learning in sentiment analysis trains a sentiment
classifier based on the frequency of occurrence of the various words contained in a document,
text, or tweet [17]. The training process takes numerical input data, such as the word index
number and the word weight (usually obtained from the calculation of TF, term presence, etc.),
and produces a value or pattern that is then used in the testing process to label tweets.
SVM finds the optimal hyperplane with the maximum margin. It constructs a hyperplane or set
of hyperplanes in a high- or infinite-dimensional space, which can be used for classification,
regression, or other tasks such as outlier detection. Intuitively, a good separation is
achieved by the hyperplane that has the largest distance to the nearest training-data point of
any class (the so-called functional margin), since in general the larger the margin, the lower
the generalization error of the classifier. The margin is the distance between the hyperplane
and the closest point of each class; these closest points are called support vectors. This
section discusses the classification methods used in this research to develop the sentiment
mining models. Classification was carried out in RapidMiner Studio with default values for all
parameters. The SVM steps conducted in this research are as follows [17]: prepare the input
data in the form of word index, word weight, and label; calculate the weight parameter (w);
calculate the bias (b); and obtain the classification for the test data. These four steps were
performed with the best parameter estimate found by grid search. Grid search builds a grid of
parameter pairs (C, γ). The values of C and γ are each fixed in advance within a range from
0.1 to 0.9 and then paired, and the pair that yields the highest accuracy is used in the four
test scenarios, which are based on the percentage of training data and test data. The
scenarios test different proportions of training data and testing data [7], as follows:
a. First model: 60% training data and 40% testing data
b. Second model: 70% training data and 30% testing data
c. Third model: 80% training data and 20% testing data
d. Fourth model: 90% training data and 10% testing data
The testing scenarios were applied to the tweet data that had been successfully analyzed.
Testing yields the number of positive, negative, and neutral sentiments obtained for each
activity of the community development program. The conclusion was drawn only from the test
data of the model that achieved the highest accuracy for each activity of the community
development program.
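The grid search over (C, γ) described in this section can be sketched as below. The paper only gives the range 0.1 to 0.9, so the step of 0.1 is an assumption, as are the names `grid_search` and `toy_evaluate`. The scoring function is a placeholder standing in for "train an SVM and measure accuracy"; it is not a real classifier and its values are not results from the study. It was chosen to produce a tie so that the handling of several equally good pairs is visible.

```python
from itertools import product


def grid_search(evaluate, values=None):
    """Score every (C, gamma) pair and return the best score with all tied pairs."""
    if values is None:
        # Assumed grid: 0.1, 0.2, ..., 0.9 for both parameters.
        values = [round(0.1 * step, 1) for step in range(1, 10)]
    scores = {(c, gamma): evaluate(c, gamma) for c, gamma in product(values, repeat=2)}
    best = max(scores.values())
    winners = [pair for pair, score in scores.items() if score == best]
    return best, winners


def toy_evaluate(c, gamma):
    """Placeholder score, NOT a trained SVM: favours large C and gamma."""
    return 1.0 if c >= 0.8 and gamma >= 0.8 else 0.5


best, winners = grid_search(toy_evaluate)
print(best, winners)
```

Returning every tied pair rather than a single winner leaves the choice among equally accurate settings to the analyst. In a real run, `evaluate` would train on the training share of one of the four models above and return accuracy on the held-out share.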
3. Results and Analysis
3.1. Preprocessing
Crawling begins by registering an API connection through Twitter Application Management to
obtain the API key, API secret, access token, and access token secret, and then performing
authentication. Data are then collected by keyword with the desired parameters; in this study
the keywords concerned the activities within the one-year period from January 2015 to January
2016, and the results were stored in a .csv (comma-delimited) file. The dataset was then
preprocessed with the following steps: filtering, lower-casing, tokenizing, stopword removal,
weighting, and class labeling.
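The first preprocessing steps (filter, lower case, tokenize, remove stop-words) can be sketched with the standard library. The paper does not say what its filter removes, so dropping URLs and @mentions is an assumption, the five-word stopword list is a tiny illustrative sample of common Indonesian function words, and the name `preprocess` is hypothetical.

```python
import re

# Illustrative stopword sample only; a real run would load a full Indonesian list.
STOPWORDS = {"yang", "dan", "di", "ke", "dari"}


def preprocess(tweet, stopwords=STOPWORDS):
    """Filter, lower-case, tokenize, and remove stopwords from one tweet."""
    text = re.sub(r"https?://\S+", " ", tweet)  # filter: drop URLs (assumed)
    text = re.sub(r"@\w+", " ", text)           # filter: drop @mentions (assumed)
    text = text.lower()                         # lower case
    tokens = re.findall(r"[a-z]+", text)        # tokenize on alphabetic runs
    return [token for token in tokens if token not in stopwords]


# Hypothetical tweet, not taken from the study's dataset.
tokens = preprocess("Kegiatan pemuda di Bogor sangat BAGUS! http://t.co/abc @user")
print(tokens)
```

Weighting and class labeling, the last two steps in the list, would then operate on the token list this function returns.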
3.2. Feature Reduction
The reduction process finds the best features to use in the classification process by
keeping the features with the highest PC values and removing features that are considered
unfavorable. The features retained are those with the highest PC values within a cumulative
variance ≤ 1; features that do not meet this criterion are no longer considered.
After feature reduction, 1219 features for Activity 1 and 951 features for Activity 2
are used in the classification process.
3.3. Classification Performance
In this study, before the test data are classified, the SVM parameters are estimated to find
the best values of the parameters c and γ. The SVM classification in the present study uses
the RBF kernel function, which requires the parameters c and γ [18]. Obtaining the best
parameter values involves several stages on the dataset. First, a parameter grid is created
from every pair of parameter values. The values of c and γ are predetermined manually, each
within a range from 0.1 to 0.9. The best pair of c and γ is the one with the highest average
accuracy. The pairs of parameter values that gave the best accuracy in sentiment
classification, 97.44%, are (c=0.8, γ=0.8), (c=0.8, γ=0.9), (c=0.9, γ=0.8) and (c=0.9,
γ=0.9). Applying the SVM algorithm with the addition of a neutral class is expected to produce
a good model with a high degree of accuracy. The classification process is illustrated
in Figure 2.
Figure 2. Flow knowledge classification sentiment
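For reference, the RBF kernel used in Section 3.3 has the standard form K(x, y) = exp(-γ‖x - y‖²), where γ is the kernel parameter tuned by the grid search. The sketch below computes it for two plain Python vectors; the function name `rbf_kernel` and the example vectors are illustrative only, and the study itself ran the kernel inside RapidMiner Studio.

```python
import math


def rbf_kernel(x, y, gamma):
    """Return exp(-gamma * squared Euclidean distance between x and y)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Identical vectors give 1.0; similarity decays as the vectors move apart.
same = rbf_kernel([1.0, 0.5], [1.0, 0.5], gamma=0.8)
apart = rbf_kernel([0.0], [1.0], gamma=0.8)
print(same, apart)
```

A larger γ makes the kernel value fall off faster with distance, so each training tweet influences a smaller neighbourhood. That is why γ is tuned together with the penalty parameter c.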
4. Evaluation
Classification performance was evaluated with test scenarios that use the estimated
parameters. Table 1 makes it easy to see how well each model classifies. The data are divided
into four models to determine the level of accuracy: Model 1 with 60% training data and 40%
test data, Model 2 with 70% and 30%, Model 3 with 80% and 20%, and Model 4 with 90% and 10%.
Table 1 shows that, on the feature-reduced dataset, the highest accuracy is obtained with
Model 1 for both Activity 1 and Activity 2. The performance of the classification model is
evaluated on three parameters: accuracy, precision, and recall. The mean values in Table 1
indicate that accuracy is not significantly affected by the division into training data and
test data. This level of accuracy is good enough when compared with previous research [13] and
with the results of classification without feature reduction by PCA.
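The three evaluation measures named above can be computed from true and predicted labels as in the following sketch. Per-class precision and recall are one common way to handle the three sentiment classes; the paper does not say how it aggregates them, so that choice, the function name `evaluate_labels`, and the four-tweet example are all assumptions.

```python
def evaluate_labels(true_labels, predicted_labels,
                    classes=("positive", "negative", "neutral")):
    """Return overall accuracy and a (precision, recall) pair per class."""
    pairs = list(zip(true_labels, predicted_labels))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    per_class = {}
    for cls in classes:
        true_positive = sum(t == cls and p == cls for t, p in pairs)
        predicted = sum(p == cls for _, p in pairs)  # tweets labelled cls by the model
        actual = sum(t == cls for t, _ in pairs)     # tweets that really are cls
        precision = true_positive / predicted if predicted else 0.0
        recall = true_positive / actual if actual else 0.0
        per_class[cls] = (precision, recall)
    return accuracy, per_class


# Hypothetical labels for four tweets (not data from the study).
truth = ["positive", "positive", "negative", "neutral"]
guess = ["positive", "negative", "negative", "neutral"]
accuracy, per_class = evaluate_labels(truth, guess)
print(accuracy, per_class)
```

In this toy case three of four tweets are labelled correctly, so accuracy is 0.75. The positive class has perfect precision but only half recall because one positive tweet was predicted as negative.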
Feature reduction can improve the accuracy of the classification process and helps reveal the
public response to the activities of the community development program through Twitter. The
weaknesses of this study are that no preprocessing was done to detect 'Alay' language in
tweets, that sentiment classification was not presented spatially by group of activities, and
that Twitter was the only social media source used for extraction.
Table 1. Accuracy of Sentiment Classification (%) using Parameters c=0.8 and γ=0.8
Model                                                  Activity 1   Activity 2
First Model (60% training data; 40% testing data)      82.78        88.64
Second Model (70% training data; 30% testing data)     79.49        85.35
Third Model (80% training data; 20% testing data)      78.75        85.71
Fourth Model (90% training data; 10% testing data)     78.39        83.15
Mean                                                   79.85        85.71
6. Conclusion
Sentiment mining models were built that are capable of extracting textual data into a
structured form, deriving sentiment from it, and classifying it to determine the public
response to the activities in community development programs. The data were collected by
crawling 1000 tweets for each activity. Preprocessing yielded 1219 and 1302 features, which
feature reduction lowered to 1156 and 951 features. The pairs of parameter values that gave
the best accuracy in sentiment classification, 97.44%, are (c=0.8, γ=0.8), (c=0.8, γ=0.9),
(c=0.9, γ=0.8) and (c=0.9, γ=0.9). With feature reduction, the highest accuracy was obtained
by Model 1, with 60% training data and 40% test data: 88.64% for Activity 2 and 82.78% for
Activity 1. The accuracy of the model is affected by SVM parameter estimation and
preprocessing, but not by the split between test data and training data. The program
evaluation shows that Activity 2 had the best spread of information with positive sentiment,
whereas Activity 1 needs wider dissemination of information about its program of activities.
References
[1] Pang B, Lee L. Opinion Mining and Sentiment Analysis. Foundations and Trends in Information
Retrieval. 2008; 2(1-2): 1-135.
[2] Rahman R. Corporate Social Responsibility. Yogyakarta: Media Pressindo. 2009.
[3] Adi IR. Empowerment, Community Development, dan Intervensi Komunitas: Seri Pemberdayaan
Masyarakat 03. Publisher Institution Faculty of Economics University Indonesia. 2003.
ISBN: 979-9242-44-5.
[4] Hemalatha I, Varma PG, Govardhan A. Preprocessing the Informal Text for Efficient Sentiment
Analysis. IJETTCS. 2012; 1. ISSN: 2278-6856.
[5] Ho C, Pong L. Interpreting TF-IDF Term Weights as Making Relevance Decision. ACM. 2008.
[6] Philips R, Pittman RH. An Introduction to Community Development. ISBN: 0-203-88693-3. First
published by Routledge, USA and Canada. 2009.
[7] Tiara, Sabariah MK, Effendy M. Sentiment Analysis on Twitter Using the Combination of Lexicon-
Based and Support Vector Machine for Assessing the Performance of a Television Program. ICoICT.
2015.
[8] Medhat W, Hassan A, Korashy H. Sentiment analysis algorithms and applications: A survey. Ain
Shams Engineering Journal. 2014: 1093-1113.
[9] Xu K, Shaoyi S, Li J, Yuxia S. Mining comparative opinions from customer reviews for Competitive
Intelligence. Decision Support Systems. 2011; 50: 743-754.
[10] Li N, Wu DD. Using Text Mining and Sentiment Analysis for Online Forums Hotspot Detection.
Decision Support Systems. 2010; 48: 354-368.
[11] Tan S, Wang Y, Cheng X. Combining Learn-Based and Lexicon-Based Techniques for Sentiment
Detection without using Labeled Examples. In Proceedings of the 31st Annual International ACM
SIGIR Conference on Research and Development in Information Retrieval. 2008. Singapore.
[12] Pang B, Lee L, Vaithyanathan S. Thumbs Up? Sentiment Classification using Machine Learning
Techniques. Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language
Processing (pp. 79-86). 2002. Stroudsburg: Association for Computational Linguistics.
[13] Subramanian KM, Venkatachalam K. Framework for Evaluating Camera Opinions. Research Journal
of Applied Sciences, Engineering and Technology. 2015; (7): 519-525. ISSN: 2040-7459; e-ISSN:
2040-7467.
[14] Wahyudin I, Djatna T. Cluster Analysis for SME Risk Analysis Documents Based on Pillar K-Means.
TELKOMNIKA. 2016; 14(2): 674~683. ISSN: 1693-6930.
[15] Jotheeswaran J, Loganathan R, Madhu SB. Feature Reduction using Principal Component Analysis
for Opinion Mining. IJCST. 2012; 3(5): 118-121. ISSN 2047-3338.
[16] Vinodhini G, Chandrasekaran RM. Opinion Mining using Principal Component Analysis Based
Ensemble Model for E-Commerce Application. CSIT. 2014; 2(3): 169-179.
DOI 10.1007/s40012-014-0055-3. Springer.
[17] Putranti ND, Winarko E. Analisis Sentiment Twitter untuk Teks Berbahasa Indonesia dengan
Maximum Entropy dan Support Vector Machine. IJCCS. 2014; 8(1): 91-100. ISSN: 1978-1520.
[18] Muis IA, Affandes M. Penerapan Metode SVM menggunakan Kernel Radial Basis Function (RBF)
pada Klasifikasi Tweet. Journal of Science Technology and Industry. 2015; 12(2): 189-197. ISSN:
1693-2390.