Nowadays, online communication is more convenient and popular than face-to-face conversation, so people prefer it over face-to-face meetings. An enormous number of people use online chat systems to speak with their loved ones at any time, anywhere in the world. As a result of this online engagement, people generate massive quantities of conversation every second. These conversations carry useful information about people's feelings at the time of the conversation. Sentiment analysis, a natural language processing technique, can analyze text and summarize the opinion it conveys. The use of online communication in the customer service portals of e-commerce platforms, and in crime investigations based on digital evidence, is increasing the need for sentiment analysis of conversations. Other languages, such as English, have well-developed libraries and resources for natural language processing, yet few studies have been conducted on Bangla. Extracting sentiment from Bangla conversational data is more challenging because of the language's grammatical complexity, which opens up vast research opportunities. In this work, support vector machine, multinomial naïve Bayes, k-nearest neighbors, logistic regression, decision tree, and random forest classifiers were used. Information extracted from the dataset was labeled as positive or negative.
Natural Language Understanding in Healthcare (David Talby)
The ability of software to reason, answer questions and intelligently converse about clinical notes, patient stories or biomedical papers has risen dramatically in the past few years.
This talk covers state of the art natural language processing, deep learning, and machine learning libraries in this space. We'll share benchmarks from industry & research projects on use cases such as clinical data abstraction, patient risk prediction, named entity recognition & resolution, and negation scope detection.
I. Hill climbing algorithm II. Steepest hill climbing algorithm (vikas dhakane)
Artificial Intelligence: Introduction, Typical Applications. State Space Search: Depth Bounded DFS, Depth First Iterative Deepening. Heuristic Search: Heuristic Functions, Best First Search, Hill Climbing, Variable Neighborhood Descent, Beam Search, Tabu Search. Optimal Search: A* algorithm, Iterative Deepening A*, Recursive Best First Search, Pruning the CLOSED and OPEN Lists
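The difference between the two hill climbing variants named in the title can be sketched in a few lines. This is a minimal illustration, not taken from the presentation: the `score` and `neighbors` functions are hypothetical, chosen so that the search space has a single peak.

```python
def simple_hill_climb(start, neighbors, score):
    """Move to the FIRST neighbor that improves the score; stop when none does."""
    current = start
    while True:
        for candidate in neighbors(current):
            if score(candidate) > score(current):
                current = candidate
                break
        else:
            # No neighbor improved on the current state: a (local) peak.
            return current


def steepest_hill_climb(start, neighbors, score):
    """Examine ALL neighbors and move to the best one; stop when it is no better."""
    current = start
    while True:
        best = max(neighbors(current), key=score, default=current)
        if score(best) <= score(current):
            return current
        current = best


def score(x):
    # Hypothetical objective with a single peak at x = 3.
    return -(x - 3) ** 2


def neighbors(x):
    # Hypothetical neighborhood: one step left or right on the integers.
    return [x - 1, x + 1]


print(simple_hill_climb(0, neighbors, score))
print(steepest_hill_climb(10, neighbors, score))
```

Both variants stop at the first state with no better neighbor, so on an objective with several peaks either one may return a local rather than a global maximum.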
Sentiment analysis and opinion mining is the study of people's opinions, attitudes and emotions expressed in text towards entities. It is useful for businesses and consumers. Sentiment analysis can be done at the document, sentence and entity/aspect level. At the document level, a review is classified as overall positive or negative. At the sentence level, each sentence is classified. At the entity/aspect level, the specific attributes that people liked and disliked are identified. Automated sentiment analysis is needed due to the large volume of online opinions and human biases. Challenges include sarcasm, context dependence of words and implicit opinions. Supervised and unsupervised machine learning techniques are used for classification.
This is the first of an 8 lecture series that I presented at University of Strathclyde in 2011/2012 as part of the final year AI course.
This lecture introduces the concept of a game, and the branch of mathematics known as Game Theory.
Text similarity measures are used to quantify the similarity between text strings and documents. Common text similarity measures include Levenshtein distance for word similarity and cosine similarity for document similarity. To apply cosine similarity, documents first need to be represented in a document-term matrix using techniques like count vectorization or TF-IDF. TF-IDF is often preferred as it assigns higher importance to rare terms compared to common terms.
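The measures described above can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions: tokenization is plain whitespace splitting, and the IDF term uses one common smoothed variant; real libraries differ in both choices.

```python
import math
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed IDF (an assumption): rare terms get larger weights.
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


docs = ["the cat sat on the mat",
        "the cat lay on the rug",
        "stock prices fell sharply"]
vectors = tfidf_vectors(docs)
print(levenshtein("kitten", "sitting"))
print(cosine_similarity(vectors[0], vectors[1]))
print(cosine_similarity(vectors[0], vectors[2]))
```

Documents that share no terms score exactly zero, while documents that share only common words score lower under TF-IDF than they would under raw counts.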
Sarcasm Detection: Achilles Heel of sentiment analysis (Anuj Gupta)
1. Sarcasm detection poses a challenge for sentiment analysis systems as sarcasm involves stating the opposite sentiment from what is meant. This "Achilles heel" is important to address from both business and research perspectives.
2. The document describes a solution for sarcasm detection that uses features extracted from pretrained convolutional neural networks for sentiment analysis and emotion detection, combined with features from a baseline model.
3. Evaluation on a test set showed improved performance over the baseline models, with future work including collecting more data and exploring attention mechanisms and recurrent neural networks. Addressing sarcasm detection was presented as an important problem at the intersection of natural language processing and domain knowledge.
Comparative Analysis of Transformer Based Pre-Trained NLP Models (saurav singla)
The document presents a comparative analysis of BERT, RoBERTa, and ALBERT models for multi-class sentiment analysis on a non-benchmark COVID-19 tweet dataset. The models were fine-tuned with a proposed architecture and evaluated using f1-score and AUC. BERT achieved the highest f1-score of 0.85, followed by RoBERTa at 0.80 and ALBERT at 0.78, showing that BERT performed best for this task. Future work could investigate model performance at different batch sizes and dropout values to determine the best model for sentiment analysis based on both accuracy and speed.
1) The document explores safety in US states by analyzing 2012 murder data from the FBI across multiple dimensions. It finds that while California had the most murders, Washington D.C. had the highest murder rate when adjusted for population and area.
2) Breaking the data down by metropolitan vs non-metropolitan areas revealed some states are safer in their non-metro or metro regions.
3) A map view showed states along the southern border generally had higher murder rates than northern states.
4) In conclusion, deeper analysis accounting for population, area, and metro/non-metro regions provides more useful insights than just raw murder counts. Washington D.C. emerges as the most unsafe when considering multiple
This document discusses basic loops and functions in R programming. It covers control statements like loops and if/else, arithmetic and boolean operators, default argument values, and returning values from functions. It also describes R programming structures, recursion, and provides an example of implementing quicksort recursively and constructing a binary search tree. The key topics are loops, control flow, functions, recursion, and examples of sorting and binary trees.
This presentation compares four tools for analysing the sentiment in the content of free-text survey responses concerning a healthcare information website. It was completed by Despo Georgiou as part of her internship at UXLabs (http://uxlabs.co.uk)
by Jennifer Shin
Senior Principal Data Scientist, Nielsen
With more and more data being collected from consumers, finding an efficient solution to aligning data over time can become increasingly difficult and yet even more necessary. Whether it's a change in the data collection process or an error in the system, working with big data requires tools that can account for real-world complexities.
This talk will introduce the benefits and complexities of implementing a 'fuzzy' solution using the Levenshtein algorithm. Attendees will walk away with a high-level understanding of fuzzy matching algorithms and learn how they can be effectively applied to solve real-world business problems.
DATA WAREHOUSE IMPLEMENTATION BY SAIKIRAN PANJALA (Saikiran Panjala)
This document discusses data warehouses, including what they are, how they are implemented, and how they can be further developed. It provides definitions of key concepts like data warehouses, data cubes, and OLAP. It also describes techniques for efficient data cube computation, indexing of OLAP data, and processing of OLAP queries. Finally, it discusses different approaches to data warehouse implementation and development of data cube technology.
Intensity Transformation and Spatial filtering (Shajun Nisha)
Dr. S. Shajun Nisha discusses intensity transformation and spatial filtering techniques in image processing. Intensity transformation functions modify pixel intensities based on a transformation function. Spatial filtering involves applying an operator over a neighborhood of pixels. Common intensity transformations include contrast stretching and logarithmic transforms. Histogram equalization is also described to improve contrast. Spatial filters include linear filters implemented using imfilter and non-linear filters like median filtering with ordfilt2 and medfilt2. Examples demonstrate applying these techniques to enhance images.
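The transformations named above can be illustrated on a small grayscale image stored as a list of rows. This is a pure-Python sketch for intuition only; it is not a substitute for the imfilter, ordfilt2, or medfilt2 functions mentioned in the summary, and its border handling (border pixels are left unchanged) is an assumption made for brevity.

```python
import math


def contrast_stretch(image, out_min=0, out_max=255):
    """Linearly map the image's own [min, max] range onto [out_min, out_max]."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[out_min for _ in row] for row in image]
    scale = (out_max - out_min) / (hi - lo)
    return [[round(out_min + (p - lo) * scale) for p in row] for row in image]


def log_transform(image, max_value=255):
    """s = c * log(1 + r), with c chosen so that max_value maps to itself."""
    c = max_value / math.log(1 + max_value)
    return [[round(c * math.log(1 + p)) for p in row] for row in image]


def median_filter_3x3(image):
    """Replace each interior pixel by the median of its 3x3 neighborhood."""
    out = [row[:] for row in image]  # border pixels are copied unchanged
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            window = sorted(image[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = window[4]  # middle of the 9 sorted values
    return out


low_contrast = [[50, 100], [150, 200]]
noisy = [[10, 10, 10], [10, 255, 10], [10, 10, 10]]
print(contrast_stretch(low_contrast))
print(log_transform([[0, 15, 255]]))
print(median_filter_3x3(noisy))
```

The first two functions are intensity transformations because each output pixel depends only on the input pixel's value; the median filter is a non-linear spatial filter because each output pixel depends on a neighborhood.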
It gives an overview of Sentiment Analysis, Natural Language Processing, Phases of Sentiment Analysis using NLP, brief idea of Machine Learning, Textblob API and related topics.
This document discusses weak slot and filler structures in artificial intelligence. It describes semantic net representation, which represents knowledge as a graphical network of nodes and arcs. It provides examples of representing statements about a cat named Jerry in a semantic net. The document also discusses frame representation, which organizes knowledge into structured records called frames that contain slots and slot values. An example frame is provided for a person named Ram. Advantages and disadvantages of both semantic nets and frames are outlined.
This document describes research on using a Long Short-Term Memory (LSTM) neural network model to predict bitcoin prices over the next 5 days. It discusses collecting bitcoin price data from 2015-2021, cleaning the data, and using features like date, price, high, low to train and test the LSTM model. Lag plots show the data has positive correlation at daily intervals. The model is trained on recent data and tested on past data to predict future prices. Root mean square error is calculated between predicted and actual test prices. The model accurately predicts future prices but could be improved by adding more price-influencing features to the training data.
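The error metric mentioned above is straightforward to compute. The sketch below assumes two equal-length lists of prices; the numbers are made up for illustration and are not results from the research.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Hypothetical prices for a 5-day test window (illustrative values only).
actual_prices = [100.0, 102.0, 101.0, 105.0, 107.0]
predicted_prices = [101.0, 101.0, 103.0, 104.0, 108.0]
print(rmse(actual_prices, predicted_prices))
```

Because the errors are squared before averaging, a single large miss raises the RMSE more than several small ones, and the result is expressed in the same unit as the prices.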
Stochastic gradient descent and its tuning (Arsalan Qadri)
This paper discusses optimization algorithms used for big data applications. We start by explaining the gradient descent algorithm and its limitations. Later we delve into stochastic gradient descent algorithms and explore methods to improve them by adjusting learning rates.
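As a rough illustration of stochastic gradient descent with an adjusted learning rate, the sketch below fits a line to a handful of points, updating the parameters after every single sample. The inverse-time decay schedule and all default values are assumptions chosen for this toy problem, not the tuning methods studied in the paper.

```python
import random


def sgd_linear_fit(xs, ys, epochs=500, lr0=0.05, decay=0.001, seed=0):
    """Fit y ~ w*x + b by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    step = 0
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in a new random order
        for i in indices:
            # Inverse-time decay (assumed schedule): the step size shrinks slowly.
            lr = lr0 / (1.0 + decay * step)
            error = (w * xs[i] + b) - ys[i]
            # Gradient of 0.5 * error**2 with respect to w and b.
            w -= lr * error * xs[i]
            b -= lr * error
            step += 1
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
w, b = sgd_linear_fit(xs, ys)
print(round(w, 3), round(b, 3))
```

Unlike batch gradient descent, each update here uses one sample, which is cheap per step but noisy; shrinking the learning rate over time is one way to damp that noise.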
Chapter summary and solutions to end-of-chapter exercises for "Data Visualization: Principles and Practice" book by Alexandru C. Telea
We presented a number of fundamental methods for visualizing scalar data: color mapping, contouring, slicing, and height plots. Color mapping assigns a color as a function of the scalar value at each point of a given domain. Contouring displays all points within a given two- or three-dimensional domain that have a given scalar value. Height plots deform the scalar dataset domain in a given direction as a function of the scalar data. The main advantages of these techniques are that they produce intuitive results, easily understood by users, and they are simple to implement. However, such techniques also have a number of restrictions.
Control Strategies
A control strategy in artificial intelligence is a technique that tells us which rule to apply next while searching for the solution to a problem within the problem space.
It helps us decide which rule to apply next without getting stuck at any point.
Characteristics of Control Strategies
A good control strategy has two main characteristics:
Control strategy should cause motion
Control strategy should be systematic
Control Strategy should cause Motion
Each rule or strategy applied should cause motion, because if there is no motion, the control strategy will never lead to a solution. Motion means a change of state; if the state does not change, there is no movement from the initial state and we will never solve the problem.
Control Strategy should be Systematic
Although the strategy applied should create motion, if we do not follow a systematic strategy we are likely to reach the same state several times before reaching the solution, which increases the number of steps. If we take care of only the first characteristic, we may go through particular useless sequences of operators several times. The requirement that a control strategy be systematic implies a need for global motion as well as local motion.
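Breadth-first search with a record of visited states is one example of a control strategy that satisfies both characteristics. The sketch below applies it to a two-jug puzzle (capacities 4 and 3, goal of 2 units in the first jug); the puzzle and all function names are illustrative choices, not part of the original text.

```python
from collections import deque


def successors(state, cap_a=4, cap_b=3):
    """All states reachable by one rule: fill, empty, or pour between jugs."""
    a, b = state
    pour_ab = min(a, cap_b - b)  # amount that can move from A to B
    pour_ba = min(b, cap_a - a)  # amount that can move from B to A
    return {
        (cap_a, b), (a, cap_b),          # fill a jug
        (0, b), (a, 0),                  # empty a jug
        (a - pour_ab, b + pour_ab),      # pour A into B
        (a + pour_ba, b - pour_ba),      # pour B into A
    } - {state}                          # drop rules that cause no motion


def breadth_first_search(start, is_goal):
    """Systematic search: expand states level by level, never revisiting one."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if is_goal(path[-1]):
            return path
        for nxt in sorted(successors(path[-1])):
            if nxt not in visited:       # skip states already reached
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None


path = breadth_first_search((0, 0), lambda s: s[0] == 2)
print(path)
```

Discarding rules that leave the state unchanged enforces motion, and the `visited` set enforces systematic exploration: no state is expanded twice, so no useless sequence of operators is repeated.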
MLDM provides an original scientific position in Europe on problems related to pattern recognition, machine learning, classification, modelling, knowledge extraction and data mining. These issues have a strong employability potential for students trained in the field of modelling, prediction or decision support, as well as in the area of the Web, image and video processing, health informatics, etc.
Breast Cancer Diagnosis using a Hybrid Genetic Algorithm for Feature Selection based on Mutual Information (Abeer Alzubaidi, Georgina Cosma, David Brown and Graham Pockley)
Interactive Technologies and Games (ITAG) Conference 2016
Health, Disability and Education. Dates: Wednesday 26 October 2016 - Thursday 27 October 2016. Location: The Council House, NG1 2DT
Based on my review, I do not believe this exchange is
actually sarcastic. It appears to be a genuine discussion of moral
philosophy comparing different crimes.
This level of error in the original tagging (nearly 50%) likely
explains the poor performance of the models. With such a high
level of noise in the training data, it would be nearly impossible
for any model to learn to correctly identify sarcasm.
7. Conclusions and Future Work
In summary, this project attempted to use NLP techniques to build
a model that could identify sarcastic comments. Unfortunately,
the results were significantly poorer than anticipated, even with
relatively large, pre-tagged corpora.
This Presentation discusses the following topics:
Introduction
Need for Problem formulation
Problem Solving Components
Definition of Problem
Problem Limitation
Goal or Solution
Solution Space
Operators
Examples of Problem Formulation
Well-defined Problems and Solution
Examples of Well-Defined Problems
Constraint satisfaction problems (CSPs)
Examples of constraint satisfaction problem
Decision problem
A Literature Survey: Neural Networks for object detection (vivatechijri)
Humans have a great capability to distinguish objects by their vision. For machines, however, object detection is a difficult problem. Neural networks have been introduced in computer science to address it. Neural networks are also called ‘Artificial Neural Networks’ [13]. Artificial neural networks are computational models of the brain that help in object detection and recognition. This paper describes and demonstrates different types of neural networks, such as ANN, KNN, Faster R-CNN, 3D-CNN, and RNN, with their accuracies. From a study of various research papers, the accuracies of the different neural networks are discussed and compared, and it is concluded that, in the given test cases, the ANN gives the best accuracy for object detection.
This document discusses various selection methods used in evolutionary algorithms. It describes parent selection methods like roulette wheel selection and tournament selection that determine which individuals are chosen to reproduce offspring. It also covers survivor selection/replacement methods like steady state, elitist, and generation replacement that determine which individuals survive to the next generation. The document provides details on how each method works and its advantages/disadvantages.
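The two parent selection methods named above can be sketched as follows. This is a minimal illustration that assumes non-negative fitness values with a positive total; the population and fitness numbers are hypothetical.

```python
import random


def roulette_wheel_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, fitness in zip(population, fitnesses):
        running += fitness
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point round-off


def tournament_select(population, fitnesses, rng, k=3):
    """Sample k distinct individuals at random and return the fittest of them."""
    contenders = rng.sample(range(len(population)), k)
    best = max(contenders, key=lambda i: fitnesses[i])
    return population[best]


rng = random.Random(42)
population = ["a", "b", "c", "d"]
fitnesses = [1.0, 2.0, 6.0, 1.0]  # hypothetical fitness values
print([roulette_wheel_select(population, fitnesses, rng) for _ in range(5)])
print([tournament_select(population, fitnesses, rng, k=2) for _ in range(5)])
```

Roulette wheel selection is sensitive to the scale of the fitness values, whereas tournament selection depends only on their ranking; the tournament size `k` controls how strongly the fittest individuals are favored.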
Genetic algorithms are inspired by Darwin's theory of natural selection and use techniques like inheritance, mutation, and selection to find optimal solutions. The document discusses genetic algorithms and their application in data mining. It provides examples of how genetic algorithms use selection, crossover, and mutation operators to evolve rules for predicting voter behavior from historical election data. The advantages are that genetic algorithms can solve complex problems where traditional search methods fail, and provide multiple solutions. Limitations include not guaranteeing a global optimum and variable optimization times. Applications include optimization, machine learning, and economic modeling.
Opinion mining on newspaper headlines using SVM and NLP (IJECEIAES)
Opinion mining, also known as sentiment analysis, is a technique that uses natural language processing (NLP) to classify the sentiment of text. Various NLP tools are available for processing text data. Multiple studies have addressed opinion mining for online blogs, Twitter, Facebook, etc. This paper proposes a new opinion mining technique using Support Vector Machine (SVM) and NLP tools on newspaper headlines. Relative words are generated using Stanford CoreNLP and passed to the SVM using a count vectorizer. A comparison of three models using confusion matrices indicates that TF-IDF with a linear SVM provides better accuracy for the smaller dataset, while for the larger dataset the SGD and linear SVM model outperforms the other models.
Analyzing Sentiment Of Movie Reviews In Bangla By Applying Machine Learning T... (Andrew Parish)
This document summarizes a research paper that analyzed sentiment of movie reviews written in Bangla using machine learning techniques. The researchers collected a dataset of over 4,000 Bangla movie reviews labeled as positive or negative. Using this dataset, they tested support vector machine and long short-term memory models, achieving 88.9% and 82.42% accuracy respectively. The paper also reviewed other prior work on Bangla sentiment analysis and compared different machine learning methods.
Comparative Analysis of Transformer Based Pre-Trained NLP Modelssaurav singla
The document presents a comparative analysis of BERT, RoBERTa, and ALBERT models for multi-class sentiment analysis on a non-benchmark COVID-19 tweet dataset. The models were fine-tuned with a proposed architecture and evaluated using f1-score and AUC. BERT achieved the highest f1-score of 0.85, followed by RoBERTa at 0.80 and ALBERT at 0.78, showing that BERT performed best for this task. Future work could investigate model performance at different batch sizes and dropout values to determine the best model for sentiment analysis based on both accuracy and speed.
1) The document explores safety in US states by analyzing 2012 murder data from the FBI across multiple dimensions. It finds that while California had the most murders, Washington D.C. had the highest murder rate when adjusted for population and area.
2) Breaking the data down by metropolitan vs non-metropolitan areas revealed some states are safer in their non-metro or metro regions.
3) A map view showed states along the southern border generally had higher murder rates than northern states.
4) In conclusion, deeper analysis accounting for population, area, and metro/non-metro regions provides more useful insights than just raw murder counts. Washington D.C. emerges as the most unsafe when considering multiple
This document discusses basic loops and functions in R programming. It covers control statements like loops and if/else, arithmetic and boolean operators, default argument values, and returning values from functions. It also describes R programming structures, recursion, and provides an example of implementing quicksort recursively and constructing a binary search tree. The key topics are loops, control flow, functions, recursion, and examples of sorting and binary trees.
This presentation compares four tools for analysing the sentiment in the content of free-text survey responses concerning a healthcare information website. It was completed by Despo Georgiou as part of her internship at UXLabs (http://uxlabs.co.uk)
by Jennifer Shin
Senior Principal Data Scientist, Nielsen
With more and more data being collected from consumers, finding a efficient solution to aligning data over time can become increasingly difficult and yet, even more necessary. Whether it's a change in the data collection process or an error in the system, working with big data requires tools that can account for real world complexities.
This talk with introduce the benefits and complexities of implementing a 'fuzzy' solution using the Levenshtein algorithm. Attendees will walk away with a high level understanding of fuzzy matching algorithms and learn how it can be effectively applied to solve real word business problem.
DATA WAREHOUSE IMPLEMENTATION BY SAIKIRAN PANJALASaikiran Panjala
This document discusses data warehouses, including what they are, how they are implemented, and how they can be further developed. It provides definitions of key concepts like data warehouses, data cubes, and OLAP. It also describes techniques for efficient data cube computation, indexing of OLAP data, and processing of OLAP queries. Finally, it discusses different approaches to data warehouse implementation and development of data cube technology.
Intensity Transformation and Spatial filteringShajun Nisha
Dr. S. Shajun Nisha discusses intensity transformation and spatial filtering techniques in image processing. Intensity transformation functions modify pixel intensities based on a transformation function. Spatial filtering involves applying an operator over a neighborhood of pixels. Common intensity transformations include contrast stretching and logarithmic transforms. Histogram equalization is also described to improve contrast. Spatial filters include linear filters implemented using imfilter and non-linear filters like median filtering with ordfilt2 and medfilt2. Examples demonstrate applying these techniques to enhance images.
It gives an overview of Sentiment Analysis, Natural Language Processing, Phases of Sentiment Analysis using NLP, brief idea of Machine Learning, Textblob API and related topics.
This document discusses weak slot and filler structures in artificial intelligence. It describes semantic net representation, which represents knowledge as a graphical network of nodes and arcs. It provides examples of representing statements about a cat named Jerry in a semantic net. The document also discusses frame representation, which organizes knowledge into structured records called frames that contain slots and slot values. An example frame is provided for a person named Ram. Advantages and disadvantages of both semantic nets and frames are outlined.
This document describes research on using a Long Short-Term Memory (LSTM) neural network model to predict bitcoin prices over the next 5 days. It discusses collecting bitcoin price data from 2015-2021, cleaning the data, and using features like date, price, high, low to train and test the LSTM model. Lag plots show the data has positive correlation at daily intervals. The model is trained on recent data and tested on past data to predict future prices. Root mean square error is calculated between predicted and actual test prices. The model accurately predicts future prices but could be improved by adding more price-influencing features to the training data.
Stochastic gradient descent and its tuningArsalan Qadri
This paper talks about optimization algorithms used for big data applications. We start with explaining the gradient descent algorithms and its limitations. Later we delve into the stochastic gradient descent algorithms and explore methods to improve it it by adjusting learning rates.
Chapter summary and solutions to end-of-chapter exercises for "Data Visualization: Principles and Practice" book by Alexandru C. Telea
We presented a number of fundamental methods for visualizing scalar data: color mapping, contouring, slicing, and height plots. Color mapping assigns a color as a function of the scalar value at each point of a given domain. Contouring displays all points within a given two- or three-dimensional domain that have a given scalar value. Height plots deform the scalar dataset domain in a given direction as a function of the scalar data. The main advantages of these techniques are that they produce intuitive results, easily understood by users, and they are simple to implement. However, such techniques also have s number of restrictions.
Control Strategies
Control Strategy in Artificial Intelligence
scenario is a technique or strategy, tells us about which rule has to be applied next while searching for the solution of a problem within problem space.
It helps us to decide which rule has to apply next without getting stuck at any point.
Characteristics of Control Strategies
A good Control strategy has two main
characteristics:
Control Strategy should cause Motion
Control strategy should be Systematic
Co ntrol Strategy should cause Motion
Each rule or strategy applied should cause the motion because if there will be no motion than such control strategy will never lead to a solution. Motion states about the change of state and if a state will not change then there be no movement from an initial state and we would never solve the problem.
Co ntrol Strategy should be Systematic
Though the strategy applied should create the
motion but if do not follow some systematic
strategy than we are likely to reach the same state
number of times before reaching the solution
which increases the number of steps. Taking care of only first strategy we may go through particular useless sequences of operators several times. Control Strategy should be systematic implies a need for global motion as well as for local motion.
MLDM provides an original scientific position in Europe on problems related to pattern recognition, machine learning, classification, modelling, knowledge extraction and data mining. These issues have a strong employability potential for students trained in the field of modelling, prediction or decision support, as well as in the area of the Web, image and video processing, health informatics, etc.
Breast Cancer Diagnosis using a Hybrid Genetic Algorithm for Feature Selection based on Mutual Information (Abeer Alzubaidi, Georgina Cosma, David Brown and Graham Pockley)
Interactive Technologies and Games (ITAG) Conference 2016
Health, Disability and EducationDates: Wednesday 26 October 2016 - Thursday 27 October 2016 Location: The Council House, NG1 2DT
Based on my review, I do not believe this exchange is
actually sarcastic. It appears to be a genuine discussion of moral
philosophy comparing different crimes.
This level of error in the original tagging (nearly 50%) likely
explains the poor performance of the models. With such a high
level of noise in the training data, it would be nearly impossible
for any model to learn to correctly identify sarcasm.
7. Conclusions and Future Work
In summary, this project attempted to use NLP techniques to build
a model that could identify sarcastic comments. Unfortunately,
the results were significantly poorer than anticipated, even with
relatively large, pre-tagged corpora.
This Presentation discusses he following topics:
Introduction
Need for Problem formulation
Problem Solving Components
Definition of Problem
Problem Limitation
Goal or Solution
Solution Space
Operators
Examples of Problem Formulation
Well-defined Problems and Solution
Examples of Well-Defined Problems
Constraint satisfaction problems (CSPs)
Examples of constraint satisfaction problem
Decision problem
A Literature Survey: Neural Networks for object detectionvivatechijri
Humans have a great capability to distinguish objects by their vision. But, for machines object
detection is an issue. Thus, Neural Networks have been introduced in the field of computer science. Neural
Networks are also called as ‘Artificial Neural Networks’ [13]. Artificial Neural Networks are computational
models of the brain which helps in object detection and recognition. This paper describes and demonstrates the
different types of Neural Networks such as ANN, KNN, FASTER R-CNN, 3D-CNN, RNN etc. with their accuracies.
From the study of various research papers, the accuracies of different Neural Networks are discussed and
compared and it can be concluded that in the given test cases, the ANN gives the best accuracy for the object
detection.
This document discusses various selection methods used in evolutionary algorithms. It describes parent selection methods like roulette wheel selection and tournament selection that determine which individuals are chosen to reproduce offspring. It also covers survivor selection/replacement methods like steady state, elitist, and generation replacement that determine which individuals survive to the next generation. The document provides details on how each method works and its advantages/disadvantages.
Genetic algorithms are inspired by Darwin's theory of natural selection and use techniques like inheritance, mutation, and selection to find optimal solutions. The document discusses genetic algorithms and their application in data mining. It provides examples of how genetic algorithms use selection, crossover, and mutation operators to evolve rules for predicting voter behavior from historical election data. The advantages are that genetic algorithms can solve complex problems where traditional search methods fail, and provide multiple solutions. Limitations include not guaranteeing a global optimum and variable optimization times. Applications include optimization, machine learning, and economic modeling.
Opinion mining on newspaper headlines using SVM and NLPIJECEIAES
Opinion mining, also known as sentiment analysis, is a technique that uses natural language processing (NLP) to classify the opinion expressed in text. Various NLP tools are available for processing text data. Multiple studies have addressed opinion mining for online blogs, Twitter, Facebook etc. This paper proposes a new opinion mining technique using Support Vector Machine (SVM) and NLP tools on newspaper headlines. Relative words are generated using Stanford CoreNLP and passed to the SVM using a count vectorizer. Comparing three models using confusion matrices, the results indicate that Tf-idf with linear SVM provides better accuracy for smaller datasets, while for larger datasets the SGD and linear SVM models outperform the other models.
Analyzing Sentiment Of Movie Reviews In Bangla By Applying Machine Learning T...Andrew Parish
This document summarizes a research paper that analyzed sentiment of movie reviews written in Bangla using machine learning techniques. The researchers collected a dataset of over 4,000 Bangla movie reviews labeled as positive or negative. Using this dataset, they tested support vector machine and long short-term memory models, achieving 88.9% and 82.42% accuracy respectively. The paper also reviewed other prior work on Bangla sentiment analysis and compared different machine learning methods.
A scalable, lexicon based technique for sentiment analysisijfcstjournal
The rapid increase in the volume of sentiment-rich social media on the web has resulted in an increased
interest among researchers in sentiment analysis and opinion mining. However, with so much
social media available on the web, sentiment analysis is now considered a big data task. Hence the
conventional sentiment analysis approaches fail to efficiently handle the vast amount of sentiment data
available nowadays. The main focus of the research was to find a technique that can efficiently
perform sentiment analysis on big data sets, that is, a technique that can categorize text as positive, negative
or neutral in a fast and accurate manner. In the research, sentiment analysis was performed on a large
data set of tweets using Hadoop, and the performance of the technique was measured in terms of speed and
accuracy. The experimental results show that the technique exhibits very good efficiency in handling big
sentiment data sets.
APPROXIMATE ANALYTICAL SOLUTION OF NON-LINEAR BOUSSINESQ EQUATION FOR THE UNS...mathsjournal
For a one-dimensional homogeneous, isotropic aquifer without accretion, the governing Boussinesq
equation under the Dupuit assumptions is a nonlinear partial differential equation. In the present paper, an
approximate analytical solution of the nonlinear Boussinesq equation is obtained using the homotopy
perturbation transform method (HPTM). The solution is compared with the exact solution. The
comparison shows that the HPTM is efficient, accurate and reliable. The effects of two important aquifer
parameters, namely specific yield and hydraulic conductivity, on the height
of the water table are studied. The results agree well with the physical phenomena.
FEATURE SELECTION AND CLASSIFICATION APPROACH FOR SENTIMENT ANALYSISmlaij
The document describes a proposed model for sentiment analysis of movie reviews using natural language processing and machine learning approaches. The model first applies various data pre-processing techniques to the dataset, including tokenization, pruning, filtering tokens, and stemming. It then investigates the performance of classifiers like Naive Bayes and SVM combined with different feature selection schemes, including term occurrence, binary term occurrence, term frequency and TF-IDF. Experiments are run using n-grams up to 4-grams to determine the best approach for sentiment analysis.
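To make the four feature schemes concrete, the sketch below computes term occurrence, binary term occurrence, term frequency, and TF-IDF over word n-grams using only the Python standard library. It is an illustration under assumptions: tokenization is a plain whitespace split, TF-IDF uses the common `tf * log(N / df)` variant (the summarized paper may use a different weighting), and the names `ngrams` and `feature_vectors` are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, max_n):
    """Return all n-grams of length 1..max_n as space-joined strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]


def feature_vectors(docs, max_n=1):
    """Build four sparse feature dictionaries per document."""
    grams = [ngrams(d.lower().split(), max_n) for d in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents each n-gram appears.
    doc_freq = Counter(g for doc in grams for g in set(doc))
    vectors = []
    for doc in grams:
        counts = Counter(doc)
        length = len(doc)
        tf = {g: c / length for g, c in counts.items()}
        vectors.append({
            "term_occurrence": dict(counts),        # raw counts
            "binary": {g: 1 for g in counts},       # presence only
            "tf": tf,                               # length-normalized counts
            "tfidf": {g: tf[g] * math.log(n_docs / doc_freq[g]) for g in counts},
        })
    return vectors


if __name__ == "__main__":
    for vec in feature_vectors(["good movie", "bad movie"], max_n=2):
        print(vec["tfidf"])
```

An n-gram that appears in every document, such as "movie" in the two toy reviews, receives a TF-IDF weight of zero under this variant, which is why TF-IDF tends to down-weight uninformative terms.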
Sensing complicated meanings from unstructured data: a novel hybrid approachIJECEIAES
The majority of data on computers nowadays is in the form of unstructured data and unstructured text. The inherent ambiguity of natural language makes it incredibly difficult but also highly profitable to find hidden information or comprehend complex semantics in unstructured text. In this paper, we present a hybrid architecture combining natural language processing (NLP) and a convolutional neural network (CNN), called automated analysis of unstructured text using machine learning (AAUT-ML), for the detection of complex semantics from unstructured data; it enables different users to understand formal semantic knowledge extracted from an unstructured text corpus. The AAUT-ML has been evaluated using three datasets, data mining (DM), operating system (OS), and database (DB), and compared with the existing models, i.e., YAKE, term frequency-inverse document frequency (TF-IDF) and text-R. The results show better outcomes in terms of precision, recall, and macro-averaged F1-score. This work presents a novel method for identifying complex semantics using unstructured data.
The sarcasm detection with the method of logistic regressionEditorIJAERD
The document discusses sarcasm detection using logistic regression. It compares the performance of logistic regression and SVM classification for sarcasm detection. Logistic regression achieved higher accuracy of 93.5% for sarcasm detection, with lower execution time compared to SVM classification. The proposed approach uses data preprocessing, feature extraction using N-grams, and trains a logistic regression classifier on a manually labeled dataset to classify text as sarcastic or non-sarcastic. Accuracy and execution time analysis shows logistic regression performs better than SVM for this task.
Review of Sentiment Analysis: An Hybrid Approach IIJSRJournal
Sentiment analysis is the detection of opinions from the features of text content, and it is recognized as one of the main parts of opinion extraction. Through this process, we can discover whether a movie script is positive, negative, or neutral. In this research, sentiment analysis is performed on calvados data. The text sentiment analyzer combines natural language processing (NLP) and machine learning techniques to assign weighted sentiment scores to the entities, topics, themes, and categories in a sentence or phrase. In expressing sentiment, the polarity of calvados text reviews can be graded on a negative-to-positive scale using the learning algorithm. The current decade has seen substantial improvements in artificial intelligence, and the machine learning revolution has transformed the entire AI sector. Machine learning techniques have grown to be an important aspect of any design in today's world. Ensemble learning techniques, however, show promise for automation by removing the need for common rules in text and sentiment classification tasks. This thesis aims to design and implement a superior performance matrix employing ensemble learning for sentiment classification as well as applications. In this paper, we have analyzed the well-known techniques adopted in the classical sentiment analysis problem of analyzing election reviews, such as Support Vector Machine (SVM) and Linear Regression (LR), for the effective detection of sentiments from the dataset obtained from the Kaggle machine learning repository.
Analyzing sentiment system to specify polarity by lexicon-basedjournalBEEI
Currently, the classification of sentiment as positive or negative is getting more attention from researchers. The rapid development of the internet and social media has led people to express their views and opinions publicly. Analyzing the sentiment in people's views and opinions has an impact on many fields, such as the services and products that companies offer. Movie reviews need considerable processing before emotions can be detected and classified with high accuracy. The difficulties arise from the structure and grammar of the language and from managing the dictionary. We present a system that assigns scores indicating positive or negative opinion to each distinct entity in the text corpus. We propose an innovative formula to compute the polarity score for each word occurring in the text; once a word is found in the positive or negative dictionary, it is removed from the text. After classification, the words are stored in a list that is used to calculate the accuracy. The results reveal that the system achieved its best accuracy of 76.585%.
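A dictionary lookup of this kind can be sketched as follows. The abstract does not give the paper's polarity formula, so this example substitutes the simplest possible score (positive hits minus negative hits); the two tiny word lists, the tie label "undecided", and the function name `polarity` are all assumptions made for illustration.

```python
# Toy dictionaries, purely illustrative; a real lexicon would be far larger.
POSITIVE = {"good", "great", "excellent", "enjoyable"}
NEGATIVE = {"bad", "boring", "poor", "terrible"}


def polarity(text, positive=POSITIVE, negative=NEGATIVE):
    """Score a text as (#positive words - #negative words) and label it."""
    tokens = [t.strip(".,!?;:").lower() for t in text.split()]
    pos_hits = sum(1 for t in tokens if t in positive)
    neg_hits = sum(1 for t in tokens if t in negative)
    score = pos_hits - neg_hits
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "undecided"  # assumption: ties are left unlabeled
    return score, label


if __name__ == "__main__":
    print(polarity("A great, enjoyable film!"))
    print(polarity("Boring and bad."))
```

A count-based score like this ignores negation ("not good") and intensity, which is one reason lexicon systems often add hand-tuned weighting rules on top of the lookup.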
A simplified classification computational model of opinion mining using deep ...IJECEIAES
Opinion mining attempts to develop an automated system to determine people's viewpoints towards various units such as events, topics, products, services, organizations, individuals, and issues. Opinion analysis from natural text can be regarded as a text and sequence classification problem which poses a high feature space due to the involvement of dynamic information that needs to be addressed precisely. This paper introduces effective modelling of human opinion analysis from social media data subjected to complex and dynamic content. First, a customized preprocessing operation based on natural language processing mechanisms is applied as an effective data treatment process towards building quality-aware input data. Second, a suitable deep learning technique, bidirectional long short-term memory (Bi-LSTM), is implemented for the opinion classification, followed by a data modelling process where truncating and padding are performed manually to achieve better data generalization in the training phase. The design and development of the model are carried out in MATLAB. The performance analysis has shown that the proposed system offers a significant advantage in terms of classification accuracy and less training time due to a reduction in the feature space by the data treatment operation.
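Truncating and padding bring every tokenized sequence to one fixed length so that batches can be fed to a recurrent model. The summarized work was done in MATLAB; the sketch below uses Python for illustration only, and it assumes padding and truncation both happen at the end of the sequence with a pad value of 0. The function name `pad_or_truncate` is hypothetical.

```python
def pad_or_truncate(seq, max_len, pad_value=0):
    """Return a list of exactly max_len items.

    Longer sequences lose their tail; shorter ones are right-padded.
    """
    clipped = list(seq)[:max_len]
    return clipped + [pad_value] * (max_len - len(clipped))


if __name__ == "__main__":
    batch = [[4, 8, 15], [16, 23, 42, 7, 9, 11]]
    print([pad_or_truncate(s, max_len=4) for s in batch])
```

Whether to pad at the start or the end, and which end to truncate, is a modelling choice; the abstract does not say which the authors used.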
This paper presents a framework called FILTWAM for real-time emotion recognition in e-learning environments using webcams. FILTWAM can recognize emotions from facial expressions and provide timely feedback. It was tested in a proof of concept study where 10 participants mimicked facial expressions corresponding to basic emotions. Video recordings were analyzed by experts and the software, showing the software achieved an overall accuracy of 72% in recognizing emotions from facial expressions. The study validated the use of webcam data for real-time interpretation of emotions during e-learning.
The Identification of Depressive Moods from Twitter Data by Using Convolution...IRJET Journal
The document presents research on identifying depressive moods on Twitter using a convolutional neural network model. Key points:
- A CNN model was developed using text and emoji representations from Twitter data to predict depressive moods.
- The model achieved 88% accuracy on the test data, demonstrating it could correctly classify sentiments most of the time.
- The model was more successful at identifying negative sentiments than positive ones, based on higher precision and recall rates for negative classes.
- Integrating both text and emoji data enhanced the model's performance, showing potential for more nuanced sentiment analysis.
Enhanced sentiment analysis based on improved word embeddings and XGboost IJECEIAES
Sentiment analysis is a well-known and rapidly expanding study topic in natural language processing (NLP) and text classification. This approach has evolved into a critical component of many applications, including politics, business, advertising, and marketing. Most current research focuses on obtaining sentiment features through lexical and syntactic analysis. Word embeddings explicitly express these characteristics. This article proposes a novel method, improved words vector for sentiments analysis (IWVS), using XGBoost to improve the F1-score of sentiment classification. The proposed method constructed sentiment vectors by averaging the word embeddings (Sentiment2Vec). We also investigated the polarized lexicon for classifying positive and negative sentiments. The sentiment vectors formed a feature space to which the examined sentiment text was mapped. Those features were input into the chosen classifier (XGBoost). We compared the F1-score of sentiment classification using our method across different machine learning models and sentiment datasets. We compare the quality of our proposition to that of baseline models, term frequency-inverse document frequency (TF-IDF) and Doc2vec, and the results show that IWVS performs better on the F1-measure for sentiment classification. At the same time, XGBoost with IWVS features was the best model in our evaluation.
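Averaging word embeddings into one vector per text can be sketched as below. This is a simplified illustration, not the IWVS method itself: the two-dimensional embedding table is made up, out-of-vocabulary tokens are simply skipped, and the function name `sentence_vector` is hypothetical. The resulting vectors would then be passed to a classifier such as XGBoost.

```python
def sentence_vector(tokens, embeddings, dim):
    """Average the embeddings of the known tokens; zeros if none are known."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


if __name__ == "__main__":
    # Toy 2-dimensional embeddings, purely illustrative.
    toy = {"good": [1.0, 0.0], "film": [0.0, 1.0], "bad": [-1.0, 0.0]}
    print(sentence_vector(["good", "film"], toy, dim=2))
    print(sentence_vector(["bad", "film", "unknownword"], toy, dim=2))
```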
IRJET- Sentimental Analysis on Audio and Video using Vader Algorithm -Monali ...IRJET Journal
This document presents a proposed system for performing sentiment analysis on audio and video reviews from social media platforms. The system first collects audio and video data from sites like YouTube and Facebook. It then separates the audio and video files, converts them to .wav format, and extracts text from the audio and video files. This extracted text is then analyzed using the VADER sentiment analysis algorithm to determine the sentiment polarity (positive, negative, neutral) expressed in the text. VADER is a lexicon-based approach that rates words based on sentiment and calculates overall sentiment scores. The proposed system aims to analyze sentiment in audio and video reviews to better understand user opinions expressed across various social media platforms.
Sentiment Analysis of Bengali text using Gated Recurrent Neural NetworkA. Hasib Uddin
Sentiment analysis is a fundamental part of Natural Language
Processing. There are numerous works on this topic in English and other
languages. However, it is still a comparatively new practice in Bangla. The
absence of a suitable Bangla corpus is the primary obstacle for sentiment
analysis tasks in Bangla. Nonetheless, Long Short-term Memory (LSTM) is a
common technique for resolving sentiments from a dataset containing a large
amount of text data. However, Gated Recurrent Unit (GRU) is very efficient for
datasets with a low amount of text data. In this manuscript, we present a 5-layered
GRU neural network model, each layer comprising 48 neurons,
and apply the model to an existing Bangla corpus. We implemented the 10-fold
cross-validation approach and repeated the same processes three times. Each
time, we considered the averages of the ten validation accuracies and losses and
compared the results with the state-of-the-art published outcome (77.85%
highest accuracy) for Bi-directional LSTM (BLSTM). The highest accuracy
for our model was 78.41%, while the lowest accuracy was 76.34%.
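The evaluation protocol described here (10-fold cross-validation, repeated three times, averaging the ten validation scores each time) can be sketched without any machine learning library. The code below only produces the index splits and averages whatever score a caller-supplied `evaluate(train_idx, val_idx)` function returns; that callback, the per-repeat shuffle seed, and the function names are assumptions made for illustration.

```python
import random


def k_fold_indices(n_samples, k, seed):
    """Yield (train_indices, validation_indices) for each of k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        val = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, val


def repeated_cv(n_samples, evaluate, k=10, repeats=3):
    """Run k-fold CV `repeats` times; return the mean fold score per repeat."""
    means = []
    for r in range(repeats):
        scores = [evaluate(train, val)
                  for train, val in k_fold_indices(n_samples, k, seed=r)]
        means.append(sum(scores) / len(scores))
    return means


if __name__ == "__main__":
    # Dummy evaluator: reports the validation share instead of training a model.
    print(repeated_cv(100, lambda train, val: len(val) / 100))
```

Reporting the highest and lowest of the per-repeat means, as the abstract does, gives a rough sense of how much the result depends on the random split.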
Dialectal Arabic sentiment analysis based on tree-based pipeline optimizatio...IJECEIAES
This document summarizes a research paper that proposes using a tree-based pipeline optimization tool (TPOT) to improve sentiment classification of dialectal Arabic texts. The paper provides background on sentiment analysis and challenges in analyzing informal Arabic texts. It then discusses related work applying TPOT and AutoML techniques to optimize machine learning for various tasks. The proposed approach uses TPOT for sentiment analysis of three Arabic dialect datasets to automatically optimize hyperparameters and improve over similar prior work.
Framework for opinion as a service on review data of customer using semantics...IJECEIAES
Opinion mining plays a significant role in representing the original and unbiased perception of products and services. However, there are various challenges associated with performing effective opinion mining in the present era of distributed computing systems with dynamic user behaviour. Existing approaches are laborious in extracting knowledge from user reviews, which is further subjected to various rounds of operation with complex procedures. The proposed system addresses the problem by introducing a novel framework called opinion-as-a-service, which is meant for direct utilization of the extracted knowledge in the most user-friendly manner. The proposed system introduces a set of three sequential algorithms that aggregate the incoming stream of opinion data, index it, and then apply semantics to extract knowledge. The study outcome shows that the proposed system is better than the existing system in mining performance.
A Subjective Feature Extraction For Sentiment Analysis In Malayalam LanguageJeff Nelson
The document discusses sentiment analysis of Malayalam film reviews using machine learning techniques. It proposes using Conditional Random Fields combined with rule-based approaches for sentiment analysis at the sentence and document level in Malayalam. The system is trained on a manually tagged corpus of over 30,000 tokens and tested on film reviews to determine the overall polarity (positive, negative, neutral) and rating of individual categories like film, direction, acting etc. The system achieved an accuracy of 82% in identifying sentiment and ratings.
Application Of Sentiment Lexicons On Movies Transcripts To Detect Violence In...Sara Alvarez
The document summarizes a research paper that used sentiment analysis techniques to detect violence in video transcripts. It applied two sentiment lexicons (English SentiWordNet and Vader Package) to 100 annotated video transcripts to classify videos as violent or non-violent. The Vader Package achieved 75% accuracy, outperforming English SentiWordNet which achieved 66% accuracy when applying part-of-speech tagging to all words, and 58% accuracy when only applying it to adjectives. The document also reviews related work on sentiment analysis, violence detection, and using video transcripts for tasks like genre classification and emotion detection.
Depression prognosis using natural language processing and machine learning ...IJECEIAES
Depression is an acute problem throughout the world. Many people die every year due to severe and prolonged depression. The problem is that most people are not aware that they are suffering from depression. In this research, our aim was to find out whether an individual is depressed or not by analyzing social media status updates. Therefore, we focused on real data. Our dataset consists of 2,000 sentences collected from different social media platforms: Facebook, Twitter, and Instagram. We then performed five data pre-processing approaches for natural language processing (NLP), such as tokenization, removal of stop words, removing empty strings, removing punctuation, stemming and lemmatization. The processed data served as the input to our selected models. Finally, we applied six machine learning (ML) classifiers, namely multinomial Naive Bayes (NB), logistic regression, linear support vector classifier, random forest, K-nearest neighbour, and decision tree, to achieve better accuracy over our dataset. Among the six algorithms, multinomial NB and logistic regression performed well on our dataset and obtained 98% accuracy.
Similar to Sentiment analysis on Bangla conversation using machine learning approach (20)
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...IJECEIAES
Medical image analysis has witnessed significant advancements with deep learning techniques. In the domain of brain tumor segmentation, the ability to
precisely delineate tumor boundaries from magnetic resonance imaging (MRI)
scans holds profound implications for diagnosis. This study presents an ensemble convolutional neural network (CNN) with transfer learning, integrating
the state-of-the-art Deeplabv3+ architecture with the ResNet18 backbone. The
model is rigorously trained and evaluated, exhibiting remarkable performance
metrics, including an impressive global accuracy of 99.286%, a high-class accuracy of 82.191%, a mean intersection over union (IoU) of 79.900%, a weighted
IoU of 98.620%, and a Boundary F1 (BF) score of 83.303%. Notably, a detailed comparative analysis with existing methods showcases the superiority of
our proposed model. These findings underscore the model’s competence in precise brain tumor localization and its potential to revolutionize medical
image analysis and enhance healthcare outcomes. This research paves the way
for future exploration and optimization of advanced CNN models in medical
imaging, with emphasis on addressing false positives and resource efficiency.
Embedded machine learning-based road conditions and driving behavior monitoringIJECEIAES
Car accident rates have increased in recent years, resulting in losses in human lives, properties, and other financial costs. An embedded machine learning-based system is developed to address this critical issue. The system can monitor road conditions, detect driving patterns, and identify aggressive driving behaviors. The system is based on neural networks trained on a comprehensive dataset of driving events, driving styles, and road conditions. The system effectively detects potential risks and helps mitigate the frequency and impact of accidents. The primary goal is to ensure the safety of drivers and vehicles. Collecting data involved gathering information on three key road events: normal street and normal drive, speed bumps, circular yellow speed bumps, and three aggressive driving actions: sudden start, sudden stop, and sudden entry. The gathered data is processed and analyzed using a machine learning system designed for limited power and memory devices. The developed system resulted in 91.9% accuracy, 93.6% precision, and 92% recall. The achieved inference time on an Arduino Nano 33 BLE Sense with a 32-bit CPU running at 64 MHz is 34 ms and requires 2.6 kB peak RAM and 139.9 kB program flash memory, making it suitable for resource-constrained embedded systems.
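For reference, the three reported metrics follow directly from true and predicted labels. The sketch below computes them for a single class treated as "positive"; the abstract does not say how the per-class values were averaged over the six event types, so the one-vs-rest view and the function name `classification_metrics` are assumptions.

```python
def classification_metrics(y_true, y_pred, positive):
    """Return (accuracy, precision, recall) for one class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


if __name__ == "__main__":
    truth = ["bump", "bump", "normal", "normal", "bump"]
    preds = ["bump", "normal", "normal", "bump", "bump"]
    print(classification_metrics(truth, preds, positive="bump"))
```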
Advanced control scheme of doubly fed induction generator for wind turbine us...IJECEIAES
This paper describes a speed control device for generating electrical energy on an electricity network based on the doubly fed induction generator (DFIG) used for wind power conversion systems. At first, a double-fed induction generator model was constructed. A control law is formulated to govern the flow of energy between the stator of a DFIG and the energy network using three types of controllers: proportional integral (PI), sliding mode controller (SMC) and second order sliding mode controller (SOSMC). Their different results in terms of power reference tracking, reaction to unexpected speed fluctuations, sensitivity to perturbations, and resilience against machine parameter alterations are compared. MATLAB/Simulink was used to conduct the simulations for the preceding study. Multiple simulations have shown very satisfying results, and the investigations demonstrate the efficacy and power-enhancing capabilities of the suggested control system.
Neural network optimizer of proportional-integral-differential controller par...IJECEIAES
The wide application of the proportional-integral-differential (PID) regulator in industry requires constant improvement of the methods used to adjust its parameters. The paper deals with optimizing PID regulator parameters using neural network methods. A methodology for choosing the architecture (structure) of the neural network optimizer is proposed, which consists in determining the number of layers, the number of neurons in each layer, and the form and type of activation function. Neural network training algorithms based on minimizing the mismatch between the regulated value and the target value are developed. The method of back propagation of gradients is proposed to select the optimal training rate of the neurons of the neural network. The neural network optimizer, which is a superstructure of the linear PID controller, allows increasing the regulation accuracy from 0.23 to 0.09, thus reducing the power consumption from 65% to 53%. The results of the conducted experiments allow us to conclude that the created neural superstructure may well become a prototype of an automatic voltage regulator (AVR)-type industrial controller for tuning the parameters of the PID controller.
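For context, the three parameters that such an optimizer tunes are the gains of a standard PID law. The sketch below is a textbook discrete-time PID step, not the paper's neural optimizer; the class name `PID`, the rectangular integration, and the backward-difference derivative are illustrative choices.

```python
class PID:
    """Discrete PID controller: u = kp*e + ki*integral(e) + kd*de/dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measured):
        """Advance one time step and return the control output."""
        error = setpoint - measured
        self.integral += error * self.dt                   # rectangular integration
        derivative = (error - self.prev_error) / self.dt   # backward difference
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


if __name__ == "__main__":
    controller = PID(kp=1.0, ki=1.0, kd=0.0, dt=0.5)
    print(controller.update(setpoint=1.0, measured=0.0))
```

A tuning procedure, neural or otherwise, searches over `kp`, `ki`, and `kd` to minimize the mismatch between the regulated value and the target value over time.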
An improved modulation technique suitable for a three level flying capacitor ...IJECEIAES
This research paper introduces an innovative modulation technique for controlling a 3-level flying capacitor multilevel inverter (FCMLI), aiming to streamline the modulation process in contrast to conventional methods. The proposed
simplified modulation technique paves the way for more straightforward and
efficient control of multilevel inverters, enabling their widespread adoption and
integration into modern power electronic systems. Through the amalgamation of
sinusoidal pulse width modulation (SPWM) with a high-frequency square wave
pulse, this controlling technique attains energy equilibrium across the coupling
capacitor. The modulation scheme incorporates a simplified switching pattern
and a decreased count of voltage references, thereby simplifying the control
algorithm.
A review on features and methods of potential fishing zoneIJECEIAES
This review focuses on the importance of identifying potential fishing zones in seawater for sustainable fishing practices. It explores features such as sea surface temperature (SST) and sea surface height (SSH), along with the classification methods applied to them. The features, such as SST and SSH, and the different classifiers used to classify the data are identified in this review study. This study underscores the importance of examining potential fishing zones using advanced analytical techniques. It thoroughly explores the methodologies employed by researchers, covering both past and current approaches. The examination centers on data characteristics and the application of classification algorithms to potential fishing zones. Furthermore, the prediction of potential fishing zones relies significantly on the effectiveness of classification algorithms. Previous research has assessed the performance of models like support vector machines, naïve Bayes, and artificial neural networks (ANN). In previous results, the support vector machine (SVM) reached 97.6% accuracy, compared with 94.2% for naive Bayes, when classifying test data for fisheries classification. By considering the recent works in this area, several recommendations for future work are presented to further improve the performance of potential fishing zone models, which is important to the fisheries community.
Electrical signal interference minimization using appropriate core material f...IJECEIAES
As demand for smaller, quicker, and more powerful devices rises, Moore's law is strictly followed. The industry has worked hard to make small devices that boost productivity. The goal is to optimize device density. Scientists are reducing connection delays to improve circuit performance. This helped them understand three-dimensional integrated circuit (3D IC) concepts, which stack active devices and create vertical connections to diminish latency and lower interconnects. Electrical interference is a big worry with 3D integrated circuits. Researchers have developed and tested through silicon vias (TSV) and substrates to decrease electrical signal interference. This study illustrates a novel noise coupling reduction method using several electrical interference models. A 22% drop in electrical interference from signal-carrying to victim TSVs introduces this new paradigm and improves system performance even at higher THz frequencies.
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...IJECEIAES
Climate change's impact on the planet forced the United Nations and governments to promote green energies and electric transportation. The deployment of photovoltaic (PV) and electric vehicle (EV) systems gained stronger momentum due to their numerous advantages over fossil fuel types. The advantages go beyond sustainability to reach financial support and stability. The work in this paper introduces a hybrid system combining PV and EV to support industrial and commercial plants. This paper covers the theoretical framework of the proposed hybrid system, including the equations required to complete the cost analysis when PV and EV are present. In addition, the proposed design diagram, which sets the priorities and requirements of the system, is presented. The proposed approach allows plants to improve their power stability, especially during power outages. The presented information supports researchers and plant owners in completing the necessary analysis while promoting the deployment of clean energy. The result of a case study of a dairy milk farmer supports the theoretical work and highlights its advanced benefits to existing plants. The short return on investment of the proposed approach supports the paper's novel approach to a sustainable electrical system. In addition, the proposed system allows for an isolated power setup without the need for a transmission line, which enhances the safety of the electrical network.
Bibliometric analysis highlighting the role of women in addressing climate ch...IJECEIAES
Fossil fuel consumption has increased quickly, contributing to climate change
that is evident in unusual flooding, droughts, and global warming. Over
the past ten years, women's involvement in society has grown dramatically,
and they have succeeded in playing a noticeable role in reducing climate change.
A bibliometric analysis of data from the last ten years has been carried out to
examine the role of women in addressing climate change. The analysis's
findings are discussed in relation to the sustainable development goals (SDGs),
particularly SDG 7 and SDG 13. The results considered contributions made
by women in the various sectors while taking geographic dispersion into
account. The bibliometric analysis delves into topics including women's
leadership in environmental groups, their involvement in policymaking, their
contributions to sustainable development projects, and the influence of
gender diversity on attempts to mitigate climate change. This study's results
highlight how women have influenced policies and actions related to climate
change, point out areas of research deficiency, and offer recommendations on how
to increase the role of women in addressing climate change and
achieving sustainability. To achieve more successful results, this initiative
aims to highlight the significance of gender equality and encourage
inclusivity in climate change decision-making processes.
Voltage and frequency control of microgrid in presence of micro-turbine inter...IJECEIAES
The active and reactive load changes have a significant impact on voltage
and frequency. In this paper, in order to stabilize the microgrid (MG) against
load variations in islanding mode, the active and reactive power of all
distributed generators (DGs), including energy storage (battery), diesel
generator, and micro-turbine, are controlled. The micro-turbine generator is
connected to MG through a three-phase to three-phase matrix converter, and
the droop control method is applied for controlling the voltage and
frequency of MG. In addition, a method is introduced for voltage and
frequency control of micro-turbines in the transition state from grid-connected mode to islanding mode. A novel switching strategy of the matrix
converter is used for converting the high-frequency output voltage of the
micro-turbine to the grid-side frequency of the utility system. Moreover,
using the switching strategy, the low-order harmonics in the output current
and voltage are not produced, and consequently, the size of the output filter
would be reduced. In fact, the suggested control strategy is load-independent
and has no frequency conversion restrictions. The proposed approach for
voltage and frequency regulation demonstrates exceptional performance and
favorable response across various load alteration scenarios. The suggested
strategy is examined in several scenarios in the MG test systems, and the
simulation results are addressed.
Enhancing battery system identification: nonlinear autoregressive modeling fo...IJECEIAES
Precisely characterizing Li-ion batteries is essential for optimizing their
performance, enhancing safety, and prolonging their lifespan across various
applications, such as electric vehicles and renewable energy systems. This
article introduces an innovative nonlinear methodology for system
identification of a Li-ion battery, employing a nonlinear autoregressive with
exogenous inputs (NARX) model. The proposed approach integrates the
benefits of nonlinear modeling with the adaptability of the NARX structure,
facilitating a more comprehensive representation of the intricate
electrochemical processes within the battery. Experimental data collected
from a Li-ion battery operating under diverse scenarios are employed to
validate the effectiveness of the proposed methodology. The identified
NARX model exhibits superior accuracy in predicting the battery's behavior
compared to traditional linear models. This study underscores the
importance of accounting for nonlinearities in battery modeling, providing
insights into the intricate relationships between state-of-charge, voltage, and
current under dynamic conditions.
Smart grid deployment: from a bibliometric analysis to a surveyIJECEIAES
Smart grids are one of the last decades' innovations in electrical energy.
They bring relevant advantages compared to the traditional grid and
significant interest from the research community. Assessing the field's
evolution is essential to propose guidelines for facing new and future smart
grid challenges. In addition, knowing the main technologies involved in the
deployment of smart grids (SGs) is important to highlight possible
shortcomings that can be mitigated by developing new tools. This paper
contributes to the research trends mentioned above by focusing on two
objectives. First, a bibliometric analysis is presented to give an overview of
the current research level about smart grid deployment. Second, a survey of
the main technological approaches used for smart grid implementation and
their contributions are highlighted. To that effect, we searched the Web of
Science (WoS), and the Scopus databases. We obtained 5,663 documents
from WoS and 7,215 from Scopus on smart grid implementation or
deployment. With the extraction limitation in the Scopus database, 5,872 of
the 7,215 documents were extracted using a multi-step process. These two
datasets have been analyzed using a bibliometric tool called bibliometrix.
The main outputs are presented with some recommendations for future
research.
Use of analytical hierarchy process for selecting and prioritizing islanding ...IJECEIAES
One of the problems associated with power systems is the islanding
condition, which must be rapidly and properly detected to prevent any
negative consequences on the system's protection, stability, and security.
This paper offers a thorough overview of several islanding detection
strategies, which are divided into two categories: classic approaches,
including local and remote approaches, and modern techniques, including
techniques based on signal processing and computational intelligence.
Additionally, each approach is compared and assessed based on several
factors, including implementation costs, non-detected zones, declining
power quality, and response times using the analytical hierarchy process
(AHP). The multi-criteria decision-making analysis yields overall
weights for passive methods (24.7%), active methods (7.8%), hybrid methods
(5.6%), remote methods (14.5%), signal processing-based methods (26.6%),
and computational intelligence-based methods (20.8%), based on the
comparison of all criteria together. Thus, it can be seen from the total weight
that hybrid approaches are the least suitable to be chosen, while signal
processing-based methods are the most appropriate islanding detection
method to be selected and implemented in power system with respect to the
aforementioned factors. Using Expert Choice software, the proposed
hierarchy model is studied and examined.
Enhancing of single-stage grid-connected photovoltaic system using fuzzy logi...IJECEIAES
The power generated by photovoltaic (PV) systems is influenced by
environmental factors. This variability hampers the control and utilization of
solar cells' peak output. In this study, a single-stage grid-connected PV
system is designed to enhance power quality. Our approach employs fuzzy
logic in the direct power control (DPC) of a three-phase voltage source
inverter (VSI), enabling seamless integration of the PV connected to the
grid. Additionally, a fuzzy logic-based maximum power point tracking
(MPPT) controller is adopted, which outperforms traditional methods like
incremental conductance (INC) in enhancing solar cell efficiency and
minimizing the response time. Moreover, the inverter's real-time active and
reactive power is directly managed to achieve a unity power factor (UPF).
The system's performance is assessed through MATLAB/Simulink
implementation, showing marked improvement over conventional methods,
particularly in steady-state and varying weather conditions. For solar
irradiances of 500 and 1,000 W/m², the results show that the proposed
method reduces the total harmonic distortion (THD) of the injected current
to the grid by approximately 46% and 38% compared to conventional
methods, respectively. Furthermore, we compare the simulation results with
IEEE standards to evaluate the system's grid compatibility.
Enhancing photovoltaic system maximum power point tracking with fuzzy logic-b...IJECEIAES
Photovoltaic systems have emerged as a promising energy resource that
caters to the future needs of society, owing to their renewable, inexhaustible,
and cost-free nature. The power output of these systems relies on solar cell
radiation and temperature. In order to mitigate the dependence on
atmospheric conditions and enhance power tracking, a conventional
approach has been improved by integrating various methods. To optimize
the generation of electricity from solar systems, the maximum power point
tracking (MPPT) technique is employed. To overcome limitations such as
steady-state voltage oscillations and improve transient response, two
traditional MPPT methods, namely fuzzy logic controller (FLC) and perturb
and observe (P&O), have been modified. This research paper aims to
simulate and validate the step size of the proposed modified P&O and FLC
techniques within the MPPT algorithm using MATLAB/Simulink for
efficient power tracking in photovoltaic systems.
Adaptive synchronous sliding control for a robot manipulator based on neural ...IJECEIAES
Robot manipulators have become important equipment in production lines, medical fields, and transportation. Improving the quality of trajectory tracking for
robot hands is always an attractive topic in the research community. This is a
challenging problem because robot manipulators are complex nonlinear systems
and are often subject to fluctuations in loads and external disturbances. This
article proposes an adaptive synchronous sliding control scheme to improve trajectory tracking performance for a robot manipulator. The proposed controller
ensures that the positions of the joints track the desired trajectory, synchronizes
the errors, and significantly reduces chattering. First, the synchronous tracking
errors and synchronous sliding surfaces are presented. Second, the synchronous
tracking error dynamics are determined. Third, a robust adaptive control law is
designed, in which the unknown components of the model are estimated online by the neural network, and the parameters of the switching elements are selected by fuzzy
logic. The built algorithm ensures that the tracking and approximation errors
are ultimately uniformly bounded (UUB). Finally, the effectiveness of the constructed algorithm is demonstrated through simulation and experimental results.
Simulation and experimental results show that the proposed controller is effective with small synchronous tracking errors, and the chattering phenomenon is
significantly reduced.
Remote field-programmable gate array laboratory for signal acquisition and de...IJECEIAES
A remote laboratory utilizing field-programmable gate array (FPGA) technologies enhances students' learning experience anywhere and anytime in embedded system design. Existing remote laboratories prioritize hardware access and visual feedback for observing board behavior after programming, neglecting comprehensive debugging tools to resolve errors that require internal signal acquisition. This paper proposes a novel remote embedded system design approach targeting FPGA technologies that is fully interactive via a web-based platform. Our solution provides FPGA board access and debugging capabilities beyond the visual feedback provided by existing remote laboratories. We implemented a lab module that users can seamlessly incorporate into their FPGA designs. The module minimizes hardware resource utilization while enabling the acquisition of a large number of data samples from the signal during the experiments by adaptively compressing the signal prior to data transmission. The results demonstrate an average compression ratio of 2.90 across three benchmark signals, indicating efficient signal acquisition and effective debugging and analysis. This method allows users to acquire more data samples than conventional methods. The proposed lab allows students to remotely test and debug their designs, bridging the gap between theory and practice in embedded system design.
Detecting and resolving feature envy through automated machine learning and m...IJECEIAES
Efficiently identifying and resolving code smells enhances software project quality. This paper presents a novel solution, utilizing automated machine learning (AutoML) techniques, to detect code smells and apply move method refactoring. By evaluating code metrics before and after refactoring, we assessed its impact on coupling, complexity, and cohesion. Key contributions of this research include a unique dataset for code smell classification and the development of models using AutoGluon for optimal performance. Furthermore, the study identifies the top 20 influential features in classifying feature envy, a well-known code smell, stemming from excessive reliance on external classes. We also explored how move method refactoring addresses feature envy, revealing reduced coupling and complexity, and improved cohesion, ultimately enhancing code quality. In summary, this research offers an empirical, data-driven approach, integrating AutoML and move method refactoring to optimize software project quality. Insights gained shed light on the benefits of refactoring on code quality and the significance of specific features in detecting feature envy. Future research can expand to explore additional refactoring techniques and a broader range of code metrics, advancing software engineering practices and standards.
Smart monitoring technique for solar cell systems using internet of things ba...IJECEIAES
Rapid, remote monitoring of solar cell system status parameters (solar irradiance, temperature, and humidity) is critical to enhancing their efficiency. Hence, in the present article, an improved smart internet of things (IoT) prototype based on an embedded system built around the NodeMCU ESP8266 (ESP-12E) was implemented experimentally. Three different regions in Egypt (Luxor, Cairo, and El-Beheira) were chosen to study their solar irradiance profile, temperature, and humidity using the proposed IoT system. The monitored solar irradiance, temperature, and humidity data were visualized live by Ubidots over the hypertext transfer protocol (HTTP). The measured solar power radiation in Luxor, Cairo, and El-Beheira ranged between 216-1000, 245-958, and 187-692 W/m², respectively, during the solar day. The accuracy and speed with which the proposed IoT system delivers monitoring results make it a strong candidate for monitoring solar cell systems. In addition, the solar power radiation results for the three regions make Luxor and Cairo strong candidates as sites for a solar cell station, rather than El-Beheira.
An efficient security framework for intrusion detection and prevention in int...IJECEIAES
Over the past few years, the internet of things (IoT) has advanced to connect billions of smart devices to improve quality of life. However, anomalies or malicious intrusions expose several security loopholes, leading to performance degradation and threats to data security in IoT operations. Therefore, IoT security systems must monitor and restrict unwanted events in the IoT network. Recently, various technical solutions based on machine learning (ML) models have been derived for identifying and restricting unwanted events in IoT. However, most ML-based approaches are prone to misclassification due to inappropriate feature selection. Additionally, most ML approaches applied to intrusion detection and prevention rely on supervised learning, which requires a large amount of labeled training data. However, such complex datasets are impossible to source in a large network like IoT. To address this problem, the proposed study introduces an efficient learning mechanism to strengthen IoT security. The proposed algorithm incorporates supervised and unsupervised approaches to improve the learning models for intrusion detection and mitigation. Compared with related works, the experimental outcome shows that the model performs well on a benchmark dataset. It achieves an improved detection accuracy of approximately 99.21%.
A review on techniques and modelling methodologies used for checking electrom...nooriasukmaningtyas
The proper functioning of the integrated circuit (IC) in an inhibiting electromagnetic environment has been a serious concern throughout the decades of revolution in the world of electronics, from discrete devices to today's integrated circuit technology, where billions of transistors are combined on a single chip. The automotive industry, and smart vehicles in particular, are confronting design issues such as susceptibility to electromagnetic interference (EMI). Because of EMI, electronic control devices calculate incorrect outputs and sensors give misleading values, which can prove fatal in automotive applications. In this paper, the authors present a non-exhaustive review of research work concerned with the investigation of EMI in ICs and the prediction of this EMI using various modelling methodologies and measurement setups.
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesChristina Lin
Traditionally, dealing with real-time data pipelines has involved significant overhead, even for straightforward tasks like data transformation or masking. However, in this talk, we’ll venture into the dynamic realm of WebAssembly (WASM) and discover how it can revolutionize the creation of stateless streaming pipelines within a Kafka (Redpanda) broker. These pipelines are adept at managing low-latency, high-data-volume scenarios.
Introduction- e - waste – definition - sources of e-waste– hazardous substances in e-waste - effects of e-waste on environment and human health- need for e-waste management– e-waste handling rules - waste minimization techniques for managing e-waste – recycling of e-waste - disposal treatment methods of e- waste – mechanism of extraction of precious metal from leaching solution-global Scenario of E-waste – E-waste in India- case studies.
International Conference on NLP, Artificial Intelligence, Machine Learning an...gerogepatton
International Conference on NLP, Artificial Intelligence, Machine Learning and Applications (NLAIM 2024) offers a premier global platform for exchanging insights and findings in the theory, methodology, and applications of NLP, Artificial Intelligence, Machine Learning, and their applications. The conference seeks substantial contributions across all key domains of NLP, Artificial Intelligence, Machine Learning, and their practical applications, aiming to foster both theoretical advancements and real-world implementations. With a focus on facilitating collaboration between researchers and practitioners from academia and industry, the conference serves as a nexus for sharing the latest developments in the field.
Understanding Inductive Bias in Machine LearningSUTEJAS
This presentation explores the concept of inductive bias in machine learning. It explains how algorithms come with built-in assumptions and preferences that guide the learning process. You'll learn about the different types of inductive bias and how they can impact the performance and generalizability of machine learning models.
The presentation also covers the positive and negative aspects of inductive bias, along with strategies for mitigating potential drawbacks. We'll explore examples of how bias manifests in algorithms like neural networks and decision trees.
By understanding inductive bias, you can gain valuable insights into how machine learning models work and make informed decisions when building and deploying them.
Literature Review Basics and Understanding Reference Management.pptxDr Ramhari Poudyal
Three-day training on academic research focuses on analytical tools at United Technical College, supported by the University Grant Commission, Nepal. 24-26 May 2024
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTjpsjournal1
The rivalry between prominent international actors for dominance over Central Asia's hydrocarbon
reserves and the ancient silk trade route, along with China's diplomatic endeavours in the area, has been
referred to as the "New Great Game." This research centres on the power struggle, considering
geopolitical, geostrategic, and geoeconomic variables. Topics including trade, political hegemony, oil
politics, and conventional and nontraditional security are all explored and explained by the researcher.
Using Mackinder's Heartland, Spykman's Rimland, and Hegemonic Stability theories, the study examines China's role
in Central Asia. It adheres to the empirical epistemological method and takes care to maintain
objectivity. It critically analyzes primary and secondary research documents to elaborate on the role of
China's geo-economic outreach in Central Asian countries and its future prospects. China is thriving in trade,
pipeline politics, and winning states, according to this study, thanks to important instruments like the
Shanghai Cooperation Organisation and the Belt and Road Economic Initiative. According to this study,
China is seeing significant success in commerce, pipeline politics, and gaining influence on other
governments. This success may be attributed to the effective utilisation of key tools such as the Shanghai
Cooperation Organisation and the Belt and Road Economic Initiative.
ACEP Magazine edition 4th launched on 05.06.2024Rahul
This document provides information about the third edition of the magazine "Sthapatya" published by the Association of Civil Engineers (Practicing) Aurangabad. It includes messages from current and past presidents of ACEP, memories and photos from past ACEP events, information on life time achievement awards given by ACEP, and a technical article on concrete maintenance, repairs and strengthening. The document highlights activities of ACEP and provides a technical educational article for members.
A SYSTEMATIC RISK ASSESSMENT APPROACH FOR SECURING THE SMART IRRIGATION SYSTEMSIJNSA Journal
The smart irrigation system represents an innovative approach to optimize water usage in agricultural and landscaping practices. The integration of cutting-edge technologies, including sensors, actuators, and data analysis, empowers this system to provide accurate monitoring and control of irrigation processes by leveraging real-time environmental conditions. The main objective of a smart irrigation system is to optimize water efficiency, minimize expenses, and foster the adoption of sustainable water management methods. This paper conducts a systematic risk assessment by exploring the key components/assets and their functionalities in the smart irrigation system. The crucial role of sensors in gathering data on soil moisture, weather patterns, and plant well-being is emphasized in this system. These sensors enable intelligent decision-making in irrigation scheduling and water distribution, leading to enhanced water efficiency and sustainable water management practices. Actuators enable automated control of irrigation devices, ensuring precise and targeted water delivery to plants. Additionally, the paper addresses the potential threat and vulnerabilities associated with smart irrigation systems. It discusses limitations of the system, such as power constraints and computational capabilities, and calculates the potential security risks. The paper suggests possible risk treatment methods for effective secure system operation. In conclusion, the paper emphasizes the significant benefits of implementing smart irrigation systems, including improved water conservation, increased crop yield, and reduced environmental impact. Additionally, based on the security analysis conducted, the paper recommends the implementation of countermeasures and security approaches to address vulnerabilities and ensure the integrity and reliability of the system. 
By incorporating these measures, smart irrigation technology can revolutionize water management practices in agriculture, promoting sustainability, resource efficiency, and safeguarding against potential security threats.
Using recycled concrete aggregates (RCA) for pavements is crucial to achieving sustainability. Implementing RCA for new pavement can minimize carbon footprint, conserve natural resources, reduce harmful emissions, and lower life cycle costs. Compared to natural aggregate (NA), RCA pavement has fewer comprehensive studies and sustainability assessments.
Sentiment analysis on Bangla conversation using machine learning approach
1. International Journal of Electrical and Computer Engineering (IJECE)
Vol. 12, No. 5, October 2022, pp. 5562~5572
ISSN: 2088-8708, DOI: 10.11591/ijece.v12i5.pp5562-5572
Journal homepage: http://ijece.iaescore.com
Sentiment analysis on Bangla conversation using machine
learning approach
Mahmudul Hassan1, Shahriar Shakil1, Nazmun Nessa Moon1, Mohammad Monirul Islam1, Refath Ara Hossain1, Asma Mariam1, Fernaz Narin Nur2
1 Department of Computer Science and Engineering, Faculty of Science and Information Technology, Daffodil International University, Dhaka, Bangladesh
2 Department of Computer Science and Engineering, Faculty of Science and Information Technology, Notre Dame University, Dhaka, Bangladesh
Article Info
Article history:
Received Jun 10, 2021
Revised May 23, 2022
Accepted Jun 20, 2022
ABSTRACT
Nowadays, online communication is more convenient and popular than
face-to-face conversation, and many people prefer it to face-to-face
meetings. An enormous number of people use online chat systems to
speak with their loved ones at any time, anywhere in the world. This
online engagement generates massive quantities of conversation every
second. People's feelings during a conversation can be
gleaned as useful information from these conversations. Sentiment
analysis, a natural language processing technique, can analyze text and
summarize the sentiment of any material. The growing use of conversation in
customer service portals on e-commerce platforms and in crime
investigations based on digital evidence is increasing the need for sentiment
analysis of conversations. Other languages, such as English, have
well-developed libraries and resources for natural language processing, yet
few studies have been conducted on Bangla. Extracting sentiment from
Bangla conversational data is more challenging because of the language's
grammatical complexity, which opens up vast research opportunities. In this work,
support vector machine, multinomial naïve Bayes, k-nearest neighbors,
logistic regression, decision tree, and random forest classifiers were used.
Information extracted from the dataset was labeled as positive or negative.
Keywords:
Accuracy rate
Detection approach
Natural language processing
Sentiment analysis
Support vector machine
Tokenizer
This is an open access article under the CC BY-SA license.
Corresponding Author:
Nazmun Nessa Moon
Department of Computer Science and Engineering, Daffodil International University
Dhaka 1207, Bangladesh
Email: moon@daffodilvarsity.edu.bd
1. INTRODUCTION
People converse in their daily lives and express their feelings and opinions in those
conversations. These feelings and opinions can be categorized as sadness, anger, happiness, worry, disgust,
fear, compliment, motivation, suggestion, and neutral [1]. Sentiment analysis, or opinion mining, aims to use
automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text [2].
In our research work, we merged these categories into two main classes, positive and negative
[3]. Sentiment analysis can be done by capturing both semantic and sentiment similarities among words [4].
Our model can identify whether a part of any conversation is positive or negative. These two categories
expose the sentiment of the person who said it. Analyzing sentiment from people's speech is a difficult task
because people can express several types of sentiment at the same time in a single sentence. Often only the
people who listen to it can understand the sentiment properly. Our proposed model can extract sentiment from
people's conversations with accuracy close to that of a real-life listener. In this research work, we propose a model that
classifies the sentiment of a conversation as positive or negative. To do so, we split our dataset in an
80:20 ratio: 80% of the data was used for training and 20% for testing. This helps to
increase the accuracy of the model, which depends heavily on
the training dataset. We also used techniques such as tuning the parameters of the machine learning
models to get more accurate results. We achieved about 86% accuracy with the support vector machine. The rest of
the algorithms performed close to this highest accuracy.
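The 80:20 split and one of the listed classifiers can be illustrated with a minimal, self-contained sketch. This is not the authors' code or data: the toy English tokens, the helper names (`train_test_split`, `train_multinomial_nb`, `predict`, `accuracy`), the fixed random seeds, and the add-one smoothing are all assumptions made for illustration, and multinomial naive Bayes is chosen here only because it is short to write without external libraries.

```python
import math
import random
from collections import Counter, defaultdict

def train_test_split(samples, test_ratio=0.2, seed=42):
    """Shuffle a copy of the samples and hold out `test_ratio` of them for testing."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

def train_multinomial_nb(train):
    """Collect class counts, per-class token counts, and the vocabulary."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in train:
        class_counts[label] += 1
        for tok in text.split():
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab

def predict(model, text):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in text.split():
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(model, test):
    """Fraction of held-out samples whose predicted label matches the true label."""
    return sum(predict(model, text) == label for text, label in test) / len(test)

# Hypothetical toy corpus standing in for labeled conversation snippets.
POSITIVE = ["good", "great", "happy", "love", "nice"]
NEGATIVE = ["bad", "sad", "angry", "hate", "awful"]
rng = random.Random(0)
data = [(" ".join(rng.choices(POSITIVE, k=3)), "positive") for _ in range(50)]
data += [(" ".join(rng.choices(NEGATIVE, k=3)), "negative") for _ in range(50)]

train_set, test_set = train_test_split(data)   # 80 training and 20 test samples
model = train_multinomial_nb(train_set)
test_accuracy = accuracy(model, test_set)
print(len(train_set), len(test_set), test_accuracy)
```

Because the test samples are never seen during training, the reported accuracy estimates how the model behaves on new conversations rather than how well it memorized the training data.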
2. LITERATURE REVIEW
Extracting sentiment from Bangla conversational data is a method for determining whether a conversation
is positive or negative. Bhowmik et al. [5] developed deep learning models for sentiment analysis on Bangla
text using an extended lexical dataset. They employed a rule-based Bangla text sentiment score system to
extract polarity from large texts. These polarities, along with the pre-processed text, are then used as training
samples by the neural network. The pre-processed texts are represented as word vectors derived
from pre-trained word embedding models with various word counts. A Word2Vec matrix containing the
highest-probability words is used as the weight matrix of the embedding layer to fit the deep learning models.
Their paper also includes a thorough examination of selected deep learning models, as well as some
fine-tuning. Their proposed hierarchical approach reached accuracies of 78.52 percent, 80.82 percent, and
84.18 percent. According to Aurpa et al. [6], harmful content such as threats and sexual
harassment is more accessible on social media than in traditional media. Harassment, vulgarity, personal attacks, and
bullying can all occur because of extremely toxic internet content. The use of Bangla on Facebook has risen in
recent years, as it is the world's seventh most spoken language. The use of offensive comments in
Bangla on Facebook has also grown significantly, but there is little research on the subject. Their study focuses on
recognizing abusive Bangla comments on social media (Facebook) so that they can be filtered out at an early
stage. To classify hostile comments quickly and accurately,
transformer-based deep neural network models were used. They employed the pre-trained language
architectures bidirectional encoder representations from transformers (BERT) and efficiently learning an
encoder that classifies token replacements accurately (ELECTRA). Average accuracy, precision, recall,
and F1-score were used to assess the proposed models. The results revealed that their BERT and
ELECTRA architectures perform well, with test accuracies of 85.00 percent and 84.92 percent,
respectively. Rahib et al. [7] investigated how Bangladeshis are reacting to and dealing
with the coronavirus disease (COVID-19) situation. In this investigation, statuses and comments on
COVID-19 concerns were gathered from multiple Facebook pages and YouTube channels run by reputable
Bangladeshi news organizations and health specialists. Throughout the study, a variety of machine learning
algorithms were studied, ranging from conventional algorithms like support vector machine and random
forest to deep learning algorithms like convolutional neural networks and long short-term memory.
Experiments were carried out on a labeled dataset of 10,581 data points collected by the authors. When
the performance of the various models is compared, the results show that long
short-term memory outperforms all of the others, with an accuracy of 84.92 percent. To detect the polarity of textual
Facebook posts in Bangla containing people's points of view on Bangladesh cricket, Faruque et al. [8]
proposed a sentiment polarity detection approach that uses three popular supervised machine learning
algorithms: naive Bayes (NB), support vector machines (SVM), and logistic regression (LR). With an
accuracy of 83 percent when considering n-grams as features, LR outperformed SVM and NB. Iqbal et al. [9]
proposed a four-step process for building a corpus that categorizes six emotions in Bengali text, comprising data crawling,
pre-processing, labelling, and verification, with 7,000 texts labeled into six basic emotion groups. The dataset
achieves a Cohen's kappa score of 0.969, which reflects close agreement between the corpus annotators and
experts. According to their analysis, the distribution of emotion words also follows Zipf's law. The
BEmoC study's findings were also presented in terms of coding consistency, emotion density, and the most
utilized emotion words.
Shetu et al. [10] established a framework for parsing text data in paragraphs. To extract sentiment
from a text, they employed the bag-of-words method and a lexical analysis method. Mamun et al. [11]
demonstrated that an ensemble approach (i.e., logistic regression+random forest+support vector machine)
with term frequency-inverse document frequency (unigram+bigram+trigram) features outperformed the other
classifier models on the developed dataset, achieving the highest accuracy of 82 percent. Most of the
emotions conveyed on social media platforms are expressed through writing (such as statuses, tweets,
comments, and reviews). Their work presents an ensemble-based method for categorizing Bengali textual
sentiment into positive and negative categories. Because no Bengali sentiment corpus was available, this effort
additionally created a dataset called "Bengali sentiment analysis dataset". Neethu and Rajasree [12]
attempted to assess the sentiment of Twitter posts in a particular domain. They suggested a new feature
vector that can differentiate between positive and negative sentiment in tweets. To examine Twitter
data for sentiment analysis, Jain and Dandannavar [13] used naive Bayes and decision tree machine learning
methods. Their proposed model employs Apache Spark because it is scalable and fast. Rahman and Dey [14]
provide two freely accessible Bangla datasets for aspect-based sentiment analysis. One dataset contains
human-annotated user comments about cricket, while the other contains restaurant
customer reviews. They also presented a baseline method for evaluating their datasets on the aspect
category extraction subtask.
3. RESEARCH METHOD
This section illustrates the overall architecture of our proposed system. As shown in Figure 1, the
research method consists of data collection, data pre-processing, model selection, statistical analysis,
and implementation, each of which is discussed in this section.
Figure 1. Method at a glance
3.1. Data collection procedure
We collected conversation data for our research from the scripts of various Bangla movies and short
films. These conversations cover a wide range of topics such as food, family, motivation, fraud, business, and
friends. After analyzing the collected data, we split it into two categories: positive and negative. We
collected 1,141 conversation samples. These conversations include emotions such as happiness, sadness,
anger, worry, and fear, which help us to divide the whole dataset into the two main categories of positive
and negative. Of the 1,141 samples, 570 are labeled positive and 571 negative.
Figure 2 shows sample data, and Figure 3 shows the class label distribution.
Figure 2. Sample data
Figure 3. Class label distribution
3.2. Data preprocessing and organizing
First, we collected data from scripts and stored them in an xlsx file. The collected dataset
has two class labels: positive and negative. As already discussed, we collected the data from movie and
short film scripts as conversations. Every conversation starts with a single word or a single sentence, and
people can express their feelings, emotions, and thoughts through a single word or sentence. To classify these
expressions into the two labels, we merged happiness, joy, motivation, and thankfulness into positive
conversations, and sadness, anger, backbiting, and worry into negative conversations. During
pre-processing, we first remove punctuation. In natural language processing, for every language, it
is essential to identify and remove stop words. For our research, we collected Bangla stop words
and removed them to clean our data. We used about 410 Bangla stop words, for example:
‘অতএব’, ‘অথচ’, ‘এই’, ‘একই’, ‘একটি’, ‘হয়’, ‘হয়তো’, ‘কিন্তু’, ‘কী’, and ‘কি’. Figure 4 shows the Python
code for removing Bangla stop words and punctuation, and Figure 5 shows the cleaned data after
pre-processing.
Figure 4. Removing stop words and punctuations
Figure 5. Cleaned data
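The cleaning step described above can be sketched in a few lines of Python. This is a minimal illustration, not the code of Figure 4: it assumes whitespace tokenization, uses only a small illustrative subset of the stop-word list, and the names `BANGLA_STOP_WORDS`, `PUNCTUATION`, and `clean_text` are hypothetical.

```python
import string

# Illustrative subset only; the full list used in the paper has about 410 entries.
BANGLA_STOP_WORDS = {"অতএব", "অথচ", "এই", "একই", "একটি", "কিন্তু", "কী", "কি"}

# ASCII punctuation plus the Bangla danda (assumption: extend as the data requires).
PUNCTUATION = string.punctuation + "।"
_PUNCT_TABLE = str.maketrans({ch: " " for ch in PUNCTUATION})


def clean_text(text: str) -> str:
    """Replace punctuation with spaces, then drop stop words."""
    no_punct = text.translate(_PUNCT_TABLE)
    tokens = [tok for tok in no_punct.split() if tok not in BANGLA_STOP_WORDS]
    return " ".join(tokens)


if __name__ == "__main__":
    print(clean_text("এই খাবার খুব ভালো, কিন্তু দাম বেশি।"))
```

Punctuation is replaced by a space rather than deleted so that words separated only by a comma or danda are not fused into one token.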
To extract features from each conversation, the number of words and the number of characters are
needed; Figure 6 shows the resulting word and character frequencies. After pre-processing, label encoding
was applied to the sentiment column, and a pickle file was then generated. The pickle file stores
intermediate data for reuse and saves time during execution. In this work, our cleaned data is stored as a
pickle file for the subsequent procedures. We need to demonstrate our dataset data where highlights are age, occupation, house
type, want to switch jobs and we are giving low highlighting to other attributes. In Figure 7, the cleaned data
is shown along with the word count and character count of each conversation.
Figure 6. Word frequency and character frequency
Figure 7. Sample of cleaned dataset
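As a rough sketch of the bookkeeping described above (word count, character count, label encoding, and storage as a pickle file), the following uses only the Python standard library. The integer mapping in `LABEL_TO_ID`, the row layout, and the function names are assumptions made for illustration; the paper does not state them.

```python
import pickle

# Hypothetical mapping; the paper does not say which class receives which integer.
LABEL_TO_ID = {"negative": 0, "positive": 1}


def build_features(rows):
    """Attach word count, character count, and an encoded label to each row.

    `rows` is a list of (cleaned_text, sentiment) pairs.
    """
    features = []
    for text, sentiment in rows:
        features.append({
            "text": text,
            "word_count": len(text.split()),
            "char_count": len(text),  # spaces are counted (assumption)
            "label": LABEL_TO_ID[sentiment],
        })
    return features


def save_pickle(obj, path):
    """Serialize the cleaned data so later steps can reload it directly."""
    with open(path, "wb") as fh:
        pickle.dump(obj, fh)


def load_pickle(path):
    """Reload previously cleaned data without repeating the pre-processing."""
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.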
3.3. Machine learning algorithms and statistical analysis
Our dataset contains about 571 records for positive and 570 records for negative conversations. We
split the dataset with a train-test split function and followed supervised machine learning
techniques. We used 80% of the data to train our models and the remaining 20% to test them, that is,
912 samples for training and 229 samples for testing. To measure accuracy on our dataset, we applied several
classifier algorithms: support vector machine, multinomial naïve Bayes, k-nearest neighbors,
logistic regression, decision tree, random forest, and stochastic gradient descent. Figure 8 gives a brief
overview of how we conducted our research.
Figure 8. Proposed model structure
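The 80/20 split can be checked with a small standard-library sketch. The function name, the fixed seed, and the choice to round the test share up are assumptions; rounding up is what reproduces the reported 912 training and 229 test samples from 1,141 records.

```python
import math
import random


def train_test_split_indices(n_samples, test_fraction=0.2, seed=42):
    """Shuffle sample indices and split them; the test size is rounded up."""
    n_test = math.ceil(n_samples * test_fraction)
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = train_test_split_indices(1141)
print(len(train_idx), len(test_idx))  # 912 229
```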
3.3.1. Feature extraction
We employ machine learning methods to achieve our natural language processing goals. Our model
is trained on features extracted from each phrase by two primary techniques. The first is a
tokenizer, which divides phrases into individual words. These unique and
common words have identical properties. The second is term frequency-inverse document frequency
(TF-IDF), a numerical statistic that reflects the importance of a term in a text. This approach has been
used in several important publications for several languages. Their success inspired us, and we found that
it gave our learning algorithms their best accuracy.
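To make the TF-IDF weighting concrete, the sketch below computes it by hand for whitespace-tokenized documents. It uses the plain textbook definition (relative term frequency multiplied by the logarithm of the inverse document frequency); library implementations often add smoothing and normalization, so this is not necessarily the exact variant used in our experiments, and the function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Split a phrase into word tokens on whitespace."""
    return text.split()


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf(t, d) = count of t in d / number of tokens in d
    idf(t)   = log(N / df(t)), N documents, df(t) documents containing t
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights
```

A term that occurs in every document receives a weight of zero, which is why TF-IDF downweights very common words and emphasizes words that distinguish one conversation from another.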
3.3.2. Classifier algorithms
Random forest builds numerous decision trees during training. The naïve Bayes classifier assumes that
the presence of a particular feature in a class is unrelated to the presence of any
other feature. This model is straightforward to build and is particularly useful for very large datasets;
despite its simplicity, naïve Bayes can even outperform more advanced classification algorithms. The
logistic regression model estimates the probability of a class or event. It can be used, for example, to
decide which of several classes an image belongs to in a set of photographs of different animals.
Stochastic gradient descent is an optimization method widely used in machine learning to
find the model parameters that best fit the predicted results to the actual results.
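As an illustration of the independence assumption behind naïve Bayes, the following is a compact multinomial naïve Bayes classifier over word counts with add-one (Laplace) smoothing. It is a teaching sketch under stated assumptions (whitespace tokenization, unseen words ignored, toy English words in place of Bangla text), not the implementation used in our experiments.

```python
import math
from collections import Counter


class MultinomialNaiveBayes:
    """Multinomial naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.split())
        self.vocab = set().union(*self.word_counts.values())
        class_counts = Counter(labels)
        self.log_prior = {
            c: math.log(class_counts[c] / len(labels)) for c in self.classes
        }
        self.total = {c: sum(self.word_counts[c].values()) for c in self.classes}
        return self

    def predict(self, text):
        scores = {}
        for c in self.classes:
            score = self.log_prior[c]
            for word in text.split():
                if word not in self.vocab:
                    continue  # words never seen in training are ignored
                # Each word contributes independently: the "naive" assumption.
                numerator = self.word_counts[c][word] + 1
                denominator = self.total[c] + len(self.vocab)
                score += math.log(numerator / denominator)
            scores[c] = score
        return max(scores, key=scores.get)


if __name__ == "__main__":
    model = MultinomialNaiveBayes().fit(
        ["good nice happy", "happy good", "bad sad angry", "sad bad"],
        ["positive", "positive", "negative", "negative"],
    )
    print(model.predict("good happy"))
```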
4. EXPERIMENTAL RESULT AND ANALYSIS
In this modern era, an understanding of IoT [15]–[17], cyber-security [18], and in particular machine
learning and deep learning [19]–[25] is crucial for intelligent data analysis and for developing the
related smart applications. According to our requirements, we updated our model and dataset using a machine
learning approach. From this modification, we conclude that the classifiers we used are suitable for a wide
range of uses on our dataset. As expected, we achieved about 86% accuracy with our proposed
model, which is a fruitful outcome. This performance creates a path toward
further improvement of the results.
The experiments focused on identifying whether a conversation is positive or negative. We
applied classifiers based on different machine learning models to determine the conversation type. The result
has two classes, positive and negative. The dataset of 1,141 samples was used to train and test each of the
models. We obtained varying
accuracy across the models. Among the 7 models, the support vector machine and multinomial naive Bayes
perform best, with the highest accuracy. As already discussed, we collected the data from scripts as
conversations. All conversations contain emotions such as happiness, sadness, worry, annoyance, and
motivation, which we merged and categorized into two main types, positive and negative. The decision-making
capability of the classifiers was measured by their performance, using accuracy, precision, recall, and
F-score. Overall accuracy was considered an adequate standard for a classifier, since it gives a notion of
the correctly classified samples in the test set.
Table 1 gives the accuracy scores obtained for the classifiers. The
support vector machine gives the highest accuracy score of 0.85589, and multinomial naive Bayes gives
an almost identical accuracy of 0.8515. Because these scores are so close, other performance measures had
to be calculated to decide on a suitable classifier for our dataset.
Precision measures the agreement of the true data labels with the positive labels given by the
classifier. Because precision is directly tied to class labels, we calculated precision scores for each of
the two class labels. Table 2 gives the values for each of the classifiers along with the 2
labels used in this research work. The random forest classifier gives a score of 0.93 and
multinomial naive Bayes gives 0.85 for positive conversation.
Recall, also known as sensitivity, measures the
effectiveness of the classifier in identifying class labels. We also concentrated on achieving a score near
1 for the positive class label. The recall scores for the two class labels and the classifiers are reported
in Table 3. The decision tree and support
vector machine had a recall score of 0.92 for positive dialogue. The F1-score can be used to determine the
relationship between the true positive labels and those provided by the classifier. It is calculated as the
harmonic mean of precision and recall for each of the two labels across all classifiers. A score close to 1
for the positive class label was considered when determining the optimum classifier. Table 4 shows the F1
scores for the class labels. Support vector machine, multinomial naïve
Bayes, and stochastic gradient descent are the most effective classifiers for
our dataset.
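The four measures can be computed directly from the counts of a binary confusion matrix, as in the sketch below. The counts passed in the example (107 true positives, 24 false positives, 9 false negatives, 89 true negatives) are not reported in this paper; they are inferred here as one set of counts over the 229 test samples that is consistent with the support vector machine scores in Table 5, and should be read as an assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 for the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Inferred counts (assumption), chosen to match the reported SVM scores.
svm = classification_metrics(tp=107, fp=24, fn=9, tn=89)
for name, value in svm.items():
    print(f"{name}: {value:.2%}")
# accuracy: 85.59%
# precision: 81.68%
# recall: 92.24%
# f1: 86.64%
```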
Table 1. Accuracy of classifiers
Classifier Accuracy
Random forest 74.24%
Decision tree 76.42%
Logistic regression 82.53%
K-nearest neighbors 82.97%
Stochastic gradient descent 83.41%
Multinomial naïve Bayes 85.15%
Support vector machine 85.59%
Table 2. Precision of classifiers
Classifier Precision
Random forest 67.01%
Decision tree 69.62%
Logistic regression 79.23%
K-nearest neighbors 79.39%
Stochastic gradient descent 79.55%
Multinomial naïve Bayes 85.96%
Support vector machine 81.68%
Table 3. Recall of classifiers
Classifier Recall
Random forest 96.55%
Decision tree 94.83%
Logistic regression 88.79%
K-nearest neighbors 89.66%
Stochastic gradient descent 90.52%
Multinomial naïve Bayes 84.48%
Support vector machine 92.24%
Table 4. F1-score of classifiers
Classifier F1-Score
Random forest 79.15%
Decision tree 80.29%
Logistic regression 83.74%
K-nearest neighbors 84.21%
Stochastic gradient descent 84.68%
Multinomial naïve Bayes 85.22%
Support vector machine 86.64%
Our objective is to predict the mentally hampered individuals with higher precision, which was
achieved by random forest, multinomial naïve Bayes, and support vector machine. In terms of accuracy,
support vector machine, multinomial naïve Bayes, and stochastic gradient descent perform well among the
classifiers, as shown in Table 5. Support vector machine, multinomial naive Bayes, and random forest all
perform well as individual classifiers, as seen in the tables. Support vector machines work well for this
task because our dataset is significantly more condensed, and the labels are poorly understood.
K-nearest neighbors works effectively since there are fewer dimensions or attributes. The assumption of class
conditional independence will only work for a large dataset, which is why the decision tree performs poorly
in this case.
To avoid over fitting and robustness, it is needed to have a strong correlation over fitting nuts,
though it is not exceptional. Because decision trees are not robust to noise and do not generalize well to
future observed data, they do not work too well. Figure 9 shows the overall performance comparison.
Table 5. Performance analysis of different algorithms
Classifier Accuracy Precision Recall F1-Score
Random forest 74.24% 67.01% 96.55% 79.15%
Decision tree 76.42% 69.62% 94.83% 80.29%
Logistic regression 82.53% 79.23% 88.79% 83.74%
K-nearest neighbors 82.97% 79.39% 89.66% 84.21%
Stochastic gradient descent 83.41% 79.55% 90.52% 84.68%
Multinomial naïve Bayes 85.15% 85.96% 84.48% 85.22%
Support vector machine 85.59% 81.68% 92.24% 86.64%
Figure 9. Performance analysis
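Because no single classifier leads on every measure, it can help to read Table 5 programmatically. The short sketch below copies the values of Table 5 and reports the top classifier for each metric; the names `TABLE_5`, `METRICS`, and `best_by_metric` are hypothetical.

```python
# Values copied from Table 5: (accuracy, precision, recall, F1-score) in percent.
TABLE_5 = {
    "Random forest": (74.24, 67.01, 96.55, 79.15),
    "Decision tree": (76.42, 69.62, 94.83, 80.29),
    "Logistic regression": (82.53, 79.23, 88.79, 83.74),
    "K-nearest neighbors": (82.97, 79.39, 89.66, 84.21),
    "Stochastic gradient descent": (83.41, 79.55, 90.52, 84.68),
    "Multinomial naïve Bayes": (85.15, 85.96, 84.48, 85.22),
    "Support vector machine": (85.59, 81.68, 92.24, 86.64),
}
METRICS = ("accuracy", "precision", "recall", "f1")


def best_by_metric(table):
    """Return the classifier with the highest score for each metric."""
    return {
        metric: max(table, key=lambda name: table[name][i])
        for i, metric in enumerate(METRICS)
    }


if __name__ == "__main__":
    for metric, name in best_by_metric(TABLE_5).items():
        print(f"{metric}: {name}")
```

On these values, support vector machine leads on accuracy and F1-score, multinomial naïve Bayes on precision, and random forest on recall.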
4.1. Prediction
We tested our model on random conversation data and obtained a prediction for each input.
Figures 10 and 11 show a conversation predicted as positive and one predicted as negative, respectively.
This indicates that our proposed model can extract sentiment from Bangla conversation data.
Figure 10. Predicting positive conversation
Figure 11. Predicting negative conversation
5. CONCLUSION
This research work achieved the expected outcome of extracting sentiment from Bangla conversation
data using a machine learning approach. Text mining and text analysis are still very new in the
Bangla language. Although it is a tough task to work with limited resources, we tried to
overcome these difficulties. Advancing technology makes communication easier, but
embracing that advancement requires keeping the enormous volume of data under control. We should pay
attention to these techniques to make the world of data more accessible and convenient.
6. FUTURE WORK
This research work proposes a methodology that opens up scope for working with Bangla conversation
data. To that end, machine learning models were trained on Bangla conversation data and were able to
extract sentiment from those conversations. There is scope to apply a deep learning approach to our dataset
to improve performance. In this work, we classify sentiment into positive and negative categories. On a
larger scale, individual emotions and sentiments such as sadness, anger, neutrality, happiness, and fear
can also be extracted. Real-time conversations could also be converted into text and
analyzed for sentiment. Scope lies in every such
opportunity, and opportunity leads to innovation and evolution.
REFERENCES
[1] C. O. Alm, D. Roth, and R. Sproat, “Emotions from text,” in Proceedings of the conference on Human Language Technology and
Empirical Methods in Natural Language Processing, 2005, pp. 579–586, doi: 10.3115/1220575.1220648.
[2] C. Lin and Y. He, “Joint sentiment/topic model for sentiment analysis,” in Proceedings of the 18th ACM Conference on
Information and Knowledge Management, 2009, p. 375, doi: 10.1145/1645953.1646003.
[3] T. Nasukawa and J. Yi, “Sentiment analysis: capturing favorability using natural language processing,” in Proceedings of the 2nd
International Conference on Knowledge Capture, K-CAP 2003, 2003, pp. 70–77, doi: 10.1145/945645.945658.
[4] A. L. Maas, R. E. Daly, P. T. Pham, D. Huang, A. Y. Ng, and C. Potts, “Learning word vectors for sentiment analysis,” in
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2011,
vol. 1, pp. 142–150.
[5] N. R. Bhowmik, M. Arifuzzaman, and M. R. H. Mondal, “Sentiment analysis on Bangla text using extended lexicon dictionary
and deep learning algorithms,” Array, vol. 13, Mar. 2022, doi: 10.1016/j.array.2021.100123.
[6] T. T. Aurpa, R. Sadik, and M. S. Ahmed, “Abusive Bangla comments detection on Facebook using transformer-based deep
learning models,” Social Network Analysis and Mining, vol. 12, no. 1, Dec. 2022, doi: 10.1007/s13278-021-00852-x.
[7] M. R. H. K. Rahib, A. H. Tamim, M. Z. Tahmeed, and M. J. Hossain, “Emotion detection based on Bangladeshi people’s social
media response on COVID-19,” SN Computer Science, vol. 3, no. 2, Mar. 2022, doi: 10.1007/s42979-022-01077-1.
[8] M. A. Faruque, S. Rahman, P. Chakraborty, T. Choudhury, J.-S. Um, and T. P. Singh, “Ascertaining polarity of public opinions
on Bangladesh cricket using machine learning techniques,” Spatial Information Research, vol. 30, no. 1, pp. 1–8, Feb. 2022, doi:
10.1007/s41324-021-00403-8.
[9] M. D. A. Iqbal, A. Das, O. Sharif, M. M. Hoque, and I. H. Sarker, “BEmoC: a corpus for identifying emotion in Bengali texts,”
SN Computer Science, vol. 3, no. 2, Mar. 2022, doi: 10.1007/s42979-022-01028-w.
[10] S. F. Shetu, M. Saifuzzaman, M. Parvin, N. N. Moon, R. Yousuf, and S. Sultana, “Identifying the writing style of bangla language
using natural language processing,” in 2020 11th International Conference on Computing, Communication and Networking
Technologies (ICCCNT), Jul. 2020, pp. 1–6, doi: 10.1109/ICCCNT49239.2020.9225670.
[11] M. M. R. Mamun, O. Sharif, and M. M. Hoque, “Classification of textual sentiment using ensemble technique,” SN Computer
Science, vol. 3, no. 1, Jan. 2022, doi: 10.1007/s42979-021-00922-z.
[12] M. S. Neethu and R. Rajasree, “Sentiment analysis in Twitter using machine learning techniques,” in 2013 Fourth International
Conference on Computing, Communications and Networking Technologies (ICCCNT), 2013, pp. 1–5, doi:
10.1109/ICCCNT.2013.6726818.
[13] A. P. Jain and P. Dandannavar, “Application of machine learning techniques to sentiment analysis,” in 2016 2nd International
Conference on Applied and Theoretical Computing and Communication Technology (iCATccT), 2016, pp. 628–632, doi:
10.1109/ICATCCT.2016.7912076.
[14] M. Rahman and E. K. Dey, “Datasets for aspect-based sentiment analysis in bangla and its baseline evaluation,” Data, vol. 3,
no. 2, May 2018, doi: 10.3390/data3020015.
[15] M. Saifuzzaman, S. F. Shetu, N. N. Moon, F. N. Nur, and M. H. Ali, “IoT based street lighting using dual axis solar tracker and
effective traffic management system using deep learning: bangladesh context,” in 2020 11th International Conference on
Computing, Communication and Networking Technologies (ICCCNT), Jul. 2020, pp. 1–5, doi:
10.1109/ICCCNT49239.2020.9225590.
[16] M. Saifuzzaman, N. N. Moon, and F. N. Nur, “IoT based street lighting and traffic management system,” in 2017 IEEE Region 10
Humanitarian Technology Conference (R10-HTC), Dec. 2017, pp. 121–124, doi: 10.1109/R10-HTC.2017.8288921.
[17] R. Hasan, S. Islam, M. H. Rahman, M. Saifuzzaman, S. F. Shetu, and N. N. Moon, “Implementation of low cost real-time
attendance management system: a comparative study,” in 2020 8th International Conference on Reliability, Infocom Technologies
and Optimization (Trends and Future Directions) (ICRITO), Jun. 2020, pp. 1098–1101, doi:
10.1109/ICRITO48877.2020.9197764.
[18] S. F. Shetu, M. Saifuzzaman, N. N. Moon, and F. N. Nur, “A survey of botnet in cyber security,” in 2019 2nd International
Conference on Intelligent Communication and Computational Techniques (ICCT), Sep. 2019, pp. 174–177, doi:
10.1109/ICCT46177.2019.8969048.
[19] K. K. Podder et al., “Bangla sign language (BdSL) alphabets and numerals classification using a deep learning model,” Sensors,
vol. 22, no. 2, Jan. 2022, doi: 10.3390/s22020574.
[20] M. Hossain et al., “Prediction on domestic violence in Bangladesh during the COVID-19 outbreak using machine learning
methods,” Applied System Innovation, vol. 4, no. 4, Oct. 2021, doi: 10.3390/asi4040077.
[21] M. Al-Smadi, O. Qawasmeh, B. Talafha, and M. Quwaider, “Human annotated Arabic dataset of book reviews for aspect based
sentiment analysis,” in 2015 3rd International Conference on Future Internet of Things and Cloud, Aug. 2015, pp. 726–730, doi:
10.1109/FiCloud.2015.62.
[22] L. Khan, A. Amjad, K. M. Afaq, and H.-T. Chang, “Deep sentiment analysis using CNN-LSTM architecture of English and
Roman Urdu text shared in social media,” Applied Sciences, vol. 12, no. 5, Mar. 2022, doi: 10.3390/app12052694.
[23] M. Chen, K. Ubul, X. Xu, A. Aysa, and M. Muhammat, “Connecting text classification with image classification: a new
preprocessing method for implicit sentiment text classification,” Sensors, vol. 22, no. 5, Feb. 2022, doi: 10.3390/s22051899.
[24] K. Schouten, O. van der Weijde, F. Frasincar, and R. Dekker, “Supervised and unsupervised aspect category detection for
sentiment analysis with co-occurrence data,” IEEE Transactions on Cybernetics, vol. 48, no. 4, pp. 1263–1275, Apr. 2018, doi:
10.1109/TCYB.2017.2688801.
[25] H. Zou and K. Xiang, “Sentiment classification method based on blending of emoticons and short texts,” Entropy, vol. 24, no. 3,
Mar. 2022, doi: 10.3390/e24030398.
BIOGRAPHIES OF AUTHORS
Mahmudul Hassan studied Computer Science and Engineering at
Daffodil International University. His main research interests are natural
language processing, image processing, machine learning, and data mining. His research
interests also include data science and computer vision. He can be contacted at email:
mahmudul15-8991@diu.edu.bd.
Shahriar Shakil studied Computer Science and Engineering at Daffodil
International University. His main research interests are image processing,
machine learning, and data mining. His research interests also include data science and
computer vision. He can be contacted at email: shahriar15-8558@diu.edu.bd.
Nazmun Nessa Moon is an associate professor of the Department of
Computer Science and Engineering at Daffodil International University. She received the
B.Sc. degree in Computer Science and Engineering from Rajshahi University of
Engineering and Technology and M.Sc. in Information and Communication Technology
from Bangladesh University of Engineering and Technology (BUET). Her interested
research fields are IoT, digital image processing and machine learning. She can be
contacted at email: moon@daffodilvarsity.edu.bd.
Mohammad Monirul Islam is a lecturer (senior scale) of the Department
of Computer Science and Engineering at Daffodil International University. He completed
his B.Sc. (India) in Computer Science and M.Sc. (UK) in Computing. His areas of
interests are navigation for organizations/institutions and data warehousing. He can be
contacted at email: monirul@daffodilvarsity.edu.bd.
Refath Ara Hossain is a lecturer of the Department of Computer Science
and Engineering at Daffodil International University. Her interested research fields are
data mining, digital image processing and machine learning. She can be contacted at
email: refath.cse@diu.edu.bd.
Asma Mariam is a lecturer of the Department of Computer Science and
Engineering at Daffodil International University. Her interested research fields are digital
image processing, data mining and machine learning. She can be contacted at email:
asma.cse@fiu.edu.bd.
Fernaz Narin Nur is an associate professor at the Notre Dame University
Bangladesh (NDUB) in the Department of CSE. She is a passionate researcher in the
fields of wireless sensor network, cloud computing, internet of things, and performance
analysis. She can be contacted at email: fernazcse@ndub.edu.bd.