In this era of technology, online communication has expanded to incorporate different language combinations and expressive styles. Code-mixing, or combining languages such as Bengali and English, is frequent in multilingual environments. In addition, the use of emojis, which are small graphical icons, has increased the complexity of online communication by allowing for more emotional expression. Sentiment analysis of comments on social media that are combined with emojis and written in Bangla-English language is the main topic of this work. This research aims to understand and classify the sentiments expressed in code-mixed comments, including the nuanced role of emojis. The data was preprocessed first, then feature extraction was performed using the TF-IDF Vectorizer and CountVectorizer algorithms. For the analysis, nine different machine learning algorithms were used. With a remarkable accuracy of 85.7% and an F1 score of 85.0%, the Support Vector Classifier stood out as the most successful model. This highlights the utility of including emoji-based features for complex sentiment analysis, particularly when dealing with code-mixed data. The dataset contained 2055 comments from Facebook pages, including Bangla-English comments with and without emojis and comments that only contained emojis. To prepare the dataset for analysis, preprocessing methods included removing irrelevant data and converting emojis into Unicode short names.
SENTIMENT ANALYSIS OF MIXED CODE FOR THE TRANSLITERATED HINDI AND MARATHI TEXTSijnlc
The evolution of information Technology has led to the collection of large amount of data, the volume of
which has increased to the extent that in last two years the data produced is greater than all the data ever
recorded in human history. This has necessitated use of machines to understand, interpret and apply data,
without manual involvement. A lot of these texts are available in transliterated code-mixed form, which due
to the complexity are very difficult to analyze. The work already performed in this area is progressing at
great pace and this work hopes to be a way to push that work further. The designed system is an effort
which classifies Hindi as well as Marathi text transliterated (Romanized) documents automatically using
supervised learning methods (KNN), Naïve Bayes and Support Vector Machine (SVM)) and ontology based
classification; and results are compared to in order to decide which methodology is better suited in
handling of these documents. As we will see, the plain machine learning algorithm applications are just as
or in many cases are much better in performance than the more analytical approach.
Sentiment analysis on Bangla conversation using machine learning approachIJECEIAES
Nowadays, online communication is more convenient and popular than faceto-face conversation. Therefore, people prefer online communication over face-to-face meetings. Enormous people use online chatting systems to speak with their loved ones at any given time throughout the world. People create massive quantities of conversation every second because of their online engagement. People's feelings during the conversation period can be gleaned as useful information from these conversations. Text analysis and conclusion of any material as summarization can be done using sentiment analysis by natural language processing. The use of communication for customer service portals in various e-commerce platforms and crime investigations based on digital evidence is increasing the need for sentiment analysis of a conversation. Other languages, such as English, have welldeveloped libraries and resources for natural language processing, yet there are few studies conducted on Bangla. It is more challenging to extract sentiments from Bangla conversational data due to the language's grammatical complexity. As a result, it opens vast study opportunities. So, support vector machine, multinomial naïve Bayes, k-nearest neighbors, logistic regression, decision tree, and random forest was used. From the dataset, extracted information was labeled as positive and negative.
Functional English Design for Domestic Migrant Workersidhasaeful
This paper aimed at: (1) describing the content of Functional English Design (FED) materials and (2) describing the appropriateness of the FEDas the English training materials for the migrant workers' candidates (MWC). This study used ADDIE (Analysing, Designing, Developing, Implementing and Evaluating) model involving totally 200 MWC in the 4 PPTKIS (namely authorized private boards in which duties serves the Indonesian workers' placement and protection abroad).The data were taken from the documentation, the trainees’ English training achievements using the FED and peer-debriefing. The gathered data was analyzed using: Content Analysis and Mean-difference computation of the trainees' test results descriptively. This study found: (1) the content of the FEDthatdeveloped“Imparting and seeking factual information” with “Minimum–adequate language Functions” was matched with the trainees needs and (2) the FED was appropriate to use as an alternative English materials since it was designed based on the result of needs analysis beside the test result in significant improvement i.e. the Mean Difference of the oral pre and post-test was 2.25 within the scoring standard scale of 0-10, while the Md of the written pre-post-test was 13.35 within the scoring standard scale of 0-100. Besides, the peers debriefing stated that the FED was recommended for use in the 4 investigated PPTKIS.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
RBIPA: An Algorithm for Iterative Stemming of Tamil Language Textskevig
Cyberbullying is currently one of the most important research fields. The majority of researchers have contributed to research on bully text identification in English texts or comments, due to the scarcity of data; analyzing Tamil textstemming is frequently a tedious job. Tamil is a morphologically diverse and agglutinative language. The creation of a Tamil stemmer is not an easy undertaking. After examining the major difficulties encountered, proposed the rule-based iterative preprocessing algorithm (RBIPA). In this attempt, Tamil morphemes and lemmas were extracted using the suffix stripping technique and a supervised machine learning algorithm for classify the word based for pronouns and proper nouns. The novelty of proposed system is developing a preprocessing algorithm for iterative stemming; lemmatize process to discovering exact words from the Tamil Language comments. RBIPA shows 84.96% of accuracy in the given Test Dataset which hasa total of 13000 words.
RBIPA: An Algorithm for Iterative Stemming of Tamil Language Textskevig
Cyberbullying is currently one of the most important research fields. The majority of researchers have contributed to research on bully text identification in English texts or comments, due to the scarcity of data; analyzing Tamil textstemming is frequently a tedious job. Tamil is a morphologically diverse and agglutinative language. The creation of a Tamil stemmer is not an easy undertaking. After examining the major difficulties encountered, proposed the rule-based iterative preprocessing algorithm (RBIPA). In this attempt, Tamil morphemes and lemmas were extracted using the suffix stripping technique and a supervised machine learning algorithm for classify the word based for pronouns and proper nouns. The novelty of proposed system is developing a preprocessing algorithm for iterative stemming; lemmatize process to discovering exact words from the Tamil Language comments. RBIPA shows 84.96% of accuracy in the given Test Dataset which hasa total of 13000 words.
Hate speech detection on Indonesian text using word embedding method-global v...IAESIJAI
Hate speech is defined as communication directed toward a specific individual or group that involves hatred or anger and a language with solid arguments leading to someone's opinion can cause social conflict. It has a lot of potential for individuals to communicate their thoughts on an online platform because the number of Internet users globally, including in Indonesia, is continually rising. This study aims to observe the impact of pre-trained global vector (GloVe) word embedding on accuracy in the classification of hate speech and non-hate speech. The use of pre-trained GloVe (Indonesian text) and single and multi-layer long short-term memory (LSTM) classifiers has performance that is resistant to overfitting compared to pre-trainable embedding for hate-speech detection. The accuracy value is 81.5% on a single layer and 80.9% on a double-layer LSTM. The following job is to provide pre-trained with formal and non-formal language corpus; pre-processing to overcome non-formal words is very challenging.
An Android Communication Platform between Hearing Impaired and General PeopleAfif Bin Kamrul
This document proposes developing an Android application to facilitate communication between hearing impaired and general people in Bangladesh through Bangla speech-to-sign language and sign language-to-text translation. It aims to recognize Bangla speech and convert it to animated sign language displays and develop a sign language keyboard to type Bangla text. The methodology involves using speech recognition APIs to convert speech to text, tagging parts of speech, looking up signs from an animated database, and displaying them sequentially via a virtual agent. It will also design a keyboard with buttons for Bangla characters and their corresponding signs.
SENTIMENT ANALYSIS OF MIXED CODE FOR THE TRANSLITERATED HINDI AND MARATHI TEXTSijnlc
The evolution of information Technology has led to the collection of large amount of data, the volume of
which has increased to the extent that in last two years the data produced is greater than all the data ever
recorded in human history. This has necessitated use of machines to understand, interpret and apply data,
without manual involvement. A lot of these texts are available in transliterated code-mixed form, which due
to the complexity are very difficult to analyze. The work already performed in this area is progressing at
great pace and this work hopes to be a way to push that work further. The designed system is an effort
which classifies Hindi as well as Marathi text transliterated (Romanized) documents automatically using
supervised learning methods (KNN), Naïve Bayes and Support Vector Machine (SVM)) and ontology based
classification; and results are compared to in order to decide which methodology is better suited in
handling of these documents. As we will see, the plain machine learning algorithm applications are just as
or in many cases are much better in performance than the more analytical approach.
Sentiment analysis on Bangla conversation using machine learning approachIJECEIAES
Nowadays, online communication is more convenient and popular than faceto-face conversation. Therefore, people prefer online communication over face-to-face meetings. Enormous people use online chatting systems to speak with their loved ones at any given time throughout the world. People create massive quantities of conversation every second because of their online engagement. People's feelings during the conversation period can be gleaned as useful information from these conversations. Text analysis and conclusion of any material as summarization can be done using sentiment analysis by natural language processing. The use of communication for customer service portals in various e-commerce platforms and crime investigations based on digital evidence is increasing the need for sentiment analysis of a conversation. Other languages, such as English, have welldeveloped libraries and resources for natural language processing, yet there are few studies conducted on Bangla. It is more challenging to extract sentiments from Bangla conversational data due to the language's grammatical complexity. As a result, it opens vast study opportunities. So, support vector machine, multinomial naïve Bayes, k-nearest neighbors, logistic regression, decision tree, and random forest was used. From the dataset, extracted information was labeled as positive and negative.
Functional English Design for Domestic Migrant Workersidhasaeful
This paper aimed at: (1) describing the content of Functional English Design (FED) materials and (2) describing the appropriateness of the FEDas the English training materials for the migrant workers' candidates (MWC). This study used ADDIE (Analysing, Designing, Developing, Implementing and Evaluating) model involving totally 200 MWC in the 4 PPTKIS (namely authorized private boards in which duties serves the Indonesian workers' placement and protection abroad).The data were taken from the documentation, the trainees’ English training achievements using the FED and peer-debriefing. The gathered data was analyzed using: Content Analysis and Mean-difference computation of the trainees' test results descriptively. This study found: (1) the content of the FEDthatdeveloped“Imparting and seeking factual information” with “Minimum–adequate language Functions” was matched with the trainees needs and (2) the FED was appropriate to use as an alternative English materials since it was designed based on the result of needs analysis beside the test result in significant improvement i.e. the Mean Difference of the oral pre and post-test was 2.25 within the scoring standard scale of 0-10, while the Md of the written pre-post-test was 13.35 within the scoring standard scale of 0-100. Besides, the peers debriefing stated that the FED was recommended for use in the 4 investigated PPTKIS.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
RBIPA: An Algorithm for Iterative Stemming of Tamil Language Textskevig
Cyberbullying is currently one of the most important research fields. The majority of researchers have contributed to research on bully text identification in English texts or comments, due to the scarcity of data; analyzing Tamil textstemming is frequently a tedious job. Tamil is a morphologically diverse and agglutinative language. The creation of a Tamil stemmer is not an easy undertaking. After examining the major difficulties encountered, proposed the rule-based iterative preprocessing algorithm (RBIPA). In this attempt, Tamil morphemes and lemmas were extracted using the suffix stripping technique and a supervised machine learning algorithm for classify the word based for pronouns and proper nouns. The novelty of proposed system is developing a preprocessing algorithm for iterative stemming; lemmatize process to discovering exact words from the Tamil Language comments. RBIPA shows 84.96% of accuracy in the given Test Dataset which hasa total of 13000 words.
RBIPA: An Algorithm for Iterative Stemming of Tamil Language Textskevig
Cyberbullying is currently one of the most important research fields. The majority of researchers have contributed to research on bully text identification in English texts or comments, due to the scarcity of data; analyzing Tamil textstemming is frequently a tedious job. Tamil is a morphologically diverse and agglutinative language. The creation of a Tamil stemmer is not an easy undertaking. After examining the major difficulties encountered, proposed the rule-based iterative preprocessing algorithm (RBIPA). In this attempt, Tamil morphemes and lemmas were extracted using the suffix stripping technique and a supervised machine learning algorithm for classify the word based for pronouns and proper nouns. The novelty of proposed system is developing a preprocessing algorithm for iterative stemming; lemmatize process to discovering exact words from the Tamil Language comments. RBIPA shows 84.96% of accuracy in the given Test Dataset which hasa total of 13000 words.
Hate speech detection on Indonesian text using word embedding method-global v...IAESIJAI
Hate speech is defined as communication directed toward a specific individual or group that involves hatred or anger and a language with solid arguments leading to someone's opinion can cause social conflict. It has a lot of potential for individuals to communicate their thoughts on an online platform because the number of Internet users globally, including in Indonesia, is continually rising. This study aims to observe the impact of pre-trained global vector (GloVe) word embedding on accuracy in the classification of hate speech and non-hate speech. The use of pre-trained GloVe (Indonesian text) and single and multi-layer long short-term memory (LSTM) classifiers has performance that is resistant to overfitting compared to pre-trainable embedding for hate-speech detection. The accuracy value is 81.5% on a single layer and 80.9% on a double-layer LSTM. The following job is to provide pre-trained with formal and non-formal language corpus; pre-processing to overcome non-formal words is very challenging.
An Android Communication Platform between Hearing Impaired and General PeopleAfif Bin Kamrul
This document proposes developing an Android application to facilitate communication between hearing impaired and general people in Bangladesh through Bangla speech-to-sign language and sign language-to-text translation. It aims to recognize Bangla speech and convert it to animated sign language displays and develop a sign language keyboard to type Bangla text. The methodology involves using speech recognition APIs to convert speech to text, tagging parts of speech, looking up signs from an animated database, and displaying them sequentially via a virtual agent. It will also design a keyboard with buttons for Bangla characters and their corresponding signs.
Dialectal Arabic sentiment analysis based on tree-based pipeline optimizatio...IJECEIAES
This document summarizes a research paper that proposes using a tree-based pipeline optimization tool (TPOT) to improve sentiment classification of dialectal Arabic texts. The paper provides background on sentiment analysis and challenges in analyzing informal Arabic texts. It then discusses related work applying TPOT and AutoML techniques to optimize machine learning for various tasks. The proposed approach uses TPOT for sentiment analysis of three Arabic dialect datasets to automatically optimize hyperparameters and improve over similar prior work.
An evolutionary approach to comparative analysis of detecting Bangla abusive ...journalBEEI
The use of Bangla abusive texts has been accelerated with the progressive use of social media. Through this platform, one can spread the hatred or negativity in a viral form. Plenty of research has been done on detecting abusive text in the English language. Bangla abusive text detection has not been done to a great extent. In this experimental study, we have applied three distinct approaches to a comprehensive dataset to obtain a better outcome. In the first study, a large dataset collected from Facebook and YouTube has been utilized to detect abusive texts. After extensive pre-processing and feature extraction, a set of consciously selected supervised machine learning classifiers i.e. multinomial Naïve Bayes (MNB), multi layer perceptron (MLP), support vector machine (SVM), decision tree, random forrest, stochastic gradient descent (SGD), ridge, perceptron and k-nearest neighbors (k-NN) has been applied to determine the best result. The second experiment is conducted by constructing a balanced dataset by random under sampling the majority class and finally, a Bengali stemmer is employed on the dataset and then the final experiment is conducted. In all three experiments, SVM with the full dataset obtained the highest accuracy of 88%.
Identification of monolingual and code-switch information from English-Kannad...IJECEIAES
Code-switching is a very common occurrence in social media communication, predominantly found in multilingual countries like India. Using more than one language in communication is known as codeswitching or code-mixing. Some of the important applications of codeswitch are machine translation (MT), shallow parsing, dialog systems, and semantic parsing. Identifying code-switch and monolingual information is useful for better communication in online networking websites. In this paper, we performed a character level n-gram approach to identify monolingual and code-switch information from English-Kannada social media data. We paralleled various machine learning techniques such as naïve Bayes (NB), support vector classifier (SVC), logistic regression (LR) and neural network (NN) on English-Kannada code-switch (EKCS) data. From the proposed approach, it is observed that the character level n-gram approach provides 1.8% to 4.1% of improvement in terms of Accuracy and 1.6% to 3.8% of improvement in F1-score. Also observed that SVC and NN techniques are outperformed in terms of accuracy (97.9%) and F1-score (98%) with character level n-gram.
Indonesian multilabel classification using IndoBERT embedding and MBERT class...IJECEIAES
The rapid increase in social media activity has triggered various discussion spaces and information exchanges on social media. Social media users can easily tell stories or comment on many things without limits. However, this often triggers open debates that lead to fights on social media. This is because many social media users use toxic comments that contain elements of racism, radicalism, pornography, or slander to argue and corner individuals or groups. These comments can easily spread and trigger users vulnerable to mental disorders due to unhealthy and unfair debates on social media. Thus, a model is needed to classify comments, especially toxic ones, in Indonesian. Transformer-based model development and natural language processing approaches can be applied to create classification models. Some previous research related to the classification of toxic comments has been done, but the classification results of the model still require exploration to get optimal results. So, this research uses the proposed model by using different pre-trained models at the embedding and classification stages, in the embedding stage using Indonesia bidirectional encoder representations from transformers (IndoBERT), and classification using multilingual bidirectional encoder representations from transformers (MBERT). The proposed model provides optimal results with an F1 value of 0.9032.
Hate Speech Recognition System through NLP and Deep LearningIRJET Journal
The document describes a proposed system for recognizing hate speech through natural language processing and deep learning techniques. It discusses how hate speech on social media platforms is a growing problem. The proposed system uses techniques like TF-IDF, entropy estimation, and a fuzzy artificial neural network for hate speech recognition. The system preprocesses text data by removing special symbols, applying stemming, and removing stop words. It then classifies text as hate speech or not hate speech using the natural language processing and deep learning models. The authors conducted experiments that showed the system achieved highly positive results in hate speech recognition performance.
A Subjective Feature Extraction For Sentiment Analysis In Malayalam LanguageJeff Nelson
The document discusses sentiment analysis of Malayalam film reviews using machine learning techniques. It proposes using Conditional Random Fields combined with rule-based approaches for sentiment analysis at the sentence and document level in Malayalam. The system is trained on a manually tagged corpus of over 30,000 tokens and tested on film reviews to determine the overall polarity (positive, negative, neutral) and rating of individual categories like film, direction, acting etc. The system achieved an accuracy of 82% in identifying sentiment and ratings.
Design Analysis Rules to Identify Proper Noun from Bengali Sentence for Univ...Syeful Islam
Abstract—Now-a-days hundreds of millions of people of
almost all levels of education and attitudes from different
country communicate with each other for different
purposes and perform their jobs on internet or other
communication medium using various languages. Not all
people know all language; therefore it is very difficult to
communicate or works on various languages. In this
situation the computer scientist introduce various inter
language translation program (Machine translation). UNL
is such kind of inter language translation program. One of
the major problem of UNL is identified a name from a
sentence, which is relatively simple in English language,
because such entities start with a capital letter. In Bangla
we do not have concept of small or capital letters. Thus
we find difficulties in understanding whether a word is a
proper noun or not. Here we have proposed analysis rules
to identify proper noun from a sentence and established
post converter which translate the name entity from
Bangla to UNL. The goal is to make possible Bangla
sentence conversion to UNL and vice versa. UNL system
prove that the theoretical analysis of our proposed system
able to identify proper noun from Bangla sentence and
produce relative Universal word for UNL.
A SURVEY OF GRAMMAR CHECKERS FOR NATURAL LANGUAGESLinda Garcia
This document summarizes several existing grammar checkers for various natural languages. It discusses rule-based, statistical, and hybrid approaches to grammar checking. Grammar checkers described include those for Afan Oromo, Amharic, Swedish, Icelandic, Nepali, and Portuguese. The document analyzes the approaches, methodologies, advantages, and limitations of each grammar checker.
A SURVEY OF GRAMMAR CHECKERS FOR NATURAL LANGUAGEScsandit
This document summarizes and reviews various grammar checkers for natural languages. It begins by defining key concepts in natural language processing like computational linguistics and grammar checking. It then describes the general working of grammar checkers, which involves preprocessing text, analyzing morphology and syntax, and identifying grammatical errors. The document surveys grammar checking approaches for several languages like rule-based, statistical, and hybrid methods. Specific grammar checkers are discussed for languages like Afan Oromo, Amharic, Swedish, Icelandic, Nepali, and Portuguese. The review concludes by analyzing the features and limitations of existing grammar checking systems.
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...Syeful Islam
More than hundreds of millions of people of almost all levels of education and attitudes from different country communicate with each other for different purposes using various languages. Machine translation is highly demanding due to increasing the usage of web based Communication. One of the major problem of Bengali translation is identified a naming word from a sentence, which is relatively simple in English language, because such entities start with a capital letter. In Bangla we do not have concept of small or capital letters and there is huge no. of different naming entity available in Bangla. Thus we find difficulties in understanding whether a word is a naming word or not. Here we have introduced a new approach to identify naming word from a Bengali sentence for machine translation system without storing huge no. of naming entity in word dictionary. The goal is to make possible Bangla sentence conversion with minimal storing word in dictionary.
Hate Speech Detection in multilingual Text using Deep LearningIRJET Journal
This paper aims to detect hate speech in multilingual text using deep learning techniques. Specifically, it focuses on English-Hindi code-mixed text from social media. The paper combines three existing datasets to create a larger consolidated dataset of over 20,000 tweets and comments annotated as hate speech or non-hate speech. It then applies and compares various machine learning and deep learning models on this dataset. The experimental results show that a CNN-BiLSTM deep learning model achieves the best performance with 87% accuracy, 82% precision, and 85% F1 score, outperforming existing approaches.
A New Approach: Automatically Identify Proper Noun from Bengali Sentence for ...Syeful Islam
More than hundreds of millions of people of almost all levels of education and attitudes from different country communicate with
each other for different using various languages. Machine translation is highly demanding due to increasing the usage of web
based Communication. One of the major problem of Bengali translation is identified a naming word from a sentence, which is
relatively simple in English language, because such entities start with a capital letter. In Bangla we do not have concept of small
or capital letters and there is huge no. of different naming entity available in Bangla. Thus we find difficulties in understanding
whether a word is a proper noun or not. Here we have introduce a new approach to identify proper noun from a Bengali sentence
for UNL without storing huge no. of naming entity in word dictionary. The goal is to make possible Bangla sentence conversion
to UNL and vice versa with minimal storing word in dictionary.
Dictionary Entries for Bangla Consonant Ended Roots in Universal Networking L...Waqas Tariq
The Universal Networking Language (UNL) deals with the communication across nations of different languages and involves with many different related discipline such as linguistics, epistemology, computer science etc. It helps to overcome the language barrier among people of different nations to solve problems emerging from current globalization trends and geopolitical interdependence. We are working to include Bangla language in the UNL system so that Bangla language can be converted to UNL expressions. As a part of this process currently we are working on Bangla Consonant Ended Verb Roots and trying to develop lexical or dictionary entries for the Consonant Ended Verb Roots. In this paper, we have presented our work by describing Bangla verb, Verb root, Verbal Inflections and then finally showed the dictionary entries for the consonant ended roots.
A Review Of Text Mining Techniques And ApplicationsLisa Graves
This document provides a review of various text mining techniques and applications. It discusses techniques used for text classification and summarization, including Naive Bayes classification, backpropagation neural networks, keyword matching, and information extraction. It also covers applications of text mining in areas like sentiment analysis of social media posts and hotel reviews. Finally, it discusses the need for organizational text mining to extract useful information and insights from large amounts of unstructured text data.
In recent years, great advances have been made in the speed, accuracy, and coverage of automatic word
sense disambiguator systems that, given a word appearing in a certain context, can identify the sense of
that word. In this paper we consider the problem of deciding whether same words contained in different
documents are related to the same meaning or are homonyms. Our goal is to improve the estimate of the
similarity of documents in which some words may be used with different meanings. We present three new
strategies for solving this problem, which are used to filter out homonyms from the similarity computation.
Two of them are intrinsically non-semantic, whereas the other one has a semantic flavor and can also be
applied to word sense disambiguation. The three strategies have been embedded in an article document
recommendation system that one of the most important Italian ad-serving companies offers to its customers
In recent years, great advances have been made in the speed, accuracy, and coverage of automatic word
sense disambiguator systems that, given a word appearing in a certain context, can identify the sense of
that word. In this paper we consider the problem of deciding whether same words contained in different
documents are related to the same meaning or are homonyms. Our goal is to improve the estimate of the
similarity of documents in which some words may be used with different meanings. We present three new
strategies for solving this problem, which are used to filter out homonyms from the similarity computation.
Two of them are intrinsically non-semantic, whereas the other one has a semantic flavor and can also be
applied to word sense disambiguation. The three strategies have been embedded in an article document
recommendation system that one of the most important Italian ad-serving companies offers to its customers.
NBLex: emotion prediction in Kannada-English code-switchtext using naïve baye...IJECEIAES
This document summarizes a research paper that proposes a lexicon-based method called NBLex to predict emotions in Kannada-English code-switch text. The paper describes creating a Kannada-English code-switch corpus by collecting comments from YouTube and annotating them. A Naive Bayes classifier is used to build the NBLex lexicon and predict emotions. Evaluation shows the NBLex method achieves 87.9% accuracy for emotion prediction, outperforming Naive Bayes and BiLSTM baselines. The paper concludes that a simple additive lexicon approach can be an alternative to machine learning for emotion prediction in code-switch text.
This study explores the correlation between technology utilization and language acquisition while analyzing the impact of moderating variables on this relation. Our meta-analysis approach analyzes data from 43 extracts out of 19 primary studies published between 2012 and 2021. Our data analysis employs a random-effect model utilizing a significance level of α = 0.05. Additionally, the authors examine four moderating variables: level of education, location of research, proficiency in language, and year of publication. Technology-based language acquisition outperforms traditional methods, indicating a significant and moderate impact on the learning process. This study enhances comprehension of the efficacy of technology in language acquisition by identifying various factors, such as the geographical location of research, methods of assessing language proficiency, and technology type employed. However, there is insufficient evidence to support the notion that educational level or sample size significantly impact technology-based language acquisition. This meta-analysis highlights the importance of considering nuanced factors when integrating technology into language learning. The findings emphasize the possibility of technology to transform methods of acquiring language and urge additional investigation into customized strategies that optimize its advantages.
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...Syeful Islam
More than hundreds of millions of people of almost all levels of education and attitudes from
different country communicate with each other for different using various languages. Machine
translation is highly demanding due to increasing the usage of web based Communication. One of
the major problem of Bengali translation is identified a naming word from a sentence, which is
relatively simple in English language, because such entities start with a capital letter. In Bangla we
do not have concept of small or capital letters and there is huge no. of different naming entity
available in Bangla. Thus we find difficulties in understanding whether a word is a naming word
(proper noun) or not. Here we have introduced a new approach to identify naming word from a
Bengali sentence for UNL without storing huge no. of naming entity in word dictionary. The goal is
to make possible Bangla sentence conversion to UNL and vice versa with minimal storing word in
dictionary.
USING STRUCTURAL EQUATION MODELING ON FOREIGN DIRECT INVESTMENT OF INDIAN ECO...indexPub
Purpose: Foreign direct investment (FDI) altogether influences the beneficiary country's financial development, making it more stable, high-quality, and healthy, according to this empirical study based on the present stage of economic development. Thus, every country encountering financial globalization is attempting to lay out a serious business climate to increment worldwide speculation. Design/Methodology/Approach: the main objective of this study is based on Institutional quality or Evidence and I selected 5 factors Institutional Metrics like Voice and Accountability, Civil liberties, Women in parliament, Corruption perceptions, Political rights from DPIIT website (Secondary Data) for the period 2018-2023. Static analysis methods such as the Unit Root Test, the ARDL Approach, and SEM are being used. Originality/Value: The experts in this study used OLS (Least Squares) regression: Foreign direct investment (FDI) streams were the focal point of the exploration. The impact of institutional qualities on unfamiliar direct speculation streams has been explored utilizing the customary least square methodology. Findings: Institutional metrics of government efficacy and corruption have shown a shortrun link with foreign direct investment (FDI) flows, according to the research, which used the ARDL model to find that these indicators had positive coefficient values. As far as institutional markers like law and order, administrative quality, and voice and responsibility, the review found that political stability had a long-term association with foreign direct investment flows (7.4578 > 4.16), placing it above the upper peasant table.
SUSTAINABLE INVESTING UNVEILED: THE ROLE OF BOND RATINGS IN GUIDING GREEN BON...indexPub
The increasing urgency to address climate change has propelled sustainable investing into the spotlight, with green bonds emerging as a pivotal instrument for mobilizing the capital required for environmental projects. This study delves into the critical role that bond ratings play in guiding investments in green bonds, shedding light on how these ratings influence investor confidence and the allocation of funds towards sustainable initiatives. By employing a mixed-methods approach, combining quantitative analysis of green bond performance with qualitative interviews from industry experts, this research offers a comprehensive overview of the interplay between bond ratings and green bond investments. The findings suggest that higher bond ratings, often indicative of lower risk and better sustainability credentials, significantly impact the attractiveness of green bonds to investors. Additionally, the study examines the evolution of rating criteria to encompass environmental, social, and governance (ESG) factors, highlighting the shift towards more holistic assessments of investment risk and potential. This research contributes to the broader discourse on sustainable finance by providing insights into the mechanisms through which bond ratings can facilitate more informed and impactful green bond investments.
More Related Content
Similar to SENTIMENT ANALYSIS OF BANGLA-ENGLISH CODE-MIXED AND TRANSLITERATED SOCIAL MEDIA COMMENTS USING MACHINE LEARNING
Dialectal Arabic sentiment analysis based on tree-based pipeline optimizatio...IJECEIAES
This document summarizes a research paper that proposes using a tree-based pipeline optimization tool (TPOT) to improve sentiment classification of dialectal Arabic texts. The paper provides background on sentiment analysis and challenges in analyzing informal Arabic texts. It then discusses related work applying TPOT and AutoML techniques to optimize machine learning for various tasks. The proposed approach uses TPOT for sentiment analysis of three Arabic dialect datasets to automatically optimize hyperparameters and improve over similar prior work.
An evolutionary approach to comparative analysis of detecting Bangla abusive ...journalBEEI
The use of Bangla abusive texts has been accelerated with the progressive use of social media. Through this platform, one can spread the hatred or negativity in a viral form. Plenty of research has been done on detecting abusive text in the English language. Bangla abusive text detection has not been done to a great extent. In this experimental study, we have applied three distinct approaches to a comprehensive dataset to obtain a better outcome. In the first study, a large dataset collected from Facebook and YouTube has been utilized to detect abusive texts. After extensive pre-processing and feature extraction, a set of consciously selected supervised machine learning classifiers i.e. multinomial Naïve Bayes (MNB), multi layer perceptron (MLP), support vector machine (SVM), decision tree, random forrest, stochastic gradient descent (SGD), ridge, perceptron and k-nearest neighbors (k-NN) has been applied to determine the best result. The second experiment is conducted by constructing a balanced dataset by random under sampling the majority class and finally, a Bengali stemmer is employed on the dataset and then the final experiment is conducted. In all three experiments, SVM with the full dataset obtained the highest accuracy of 88%.
Identification of monolingual and code-switch information from English-Kannad...IJECEIAES
Code-switching is a very common occurrence in social media communication, predominantly found in multilingual countries like India. Using more than one language in communication is known as codeswitching or code-mixing. Some of the important applications of codeswitch are machine translation (MT), shallow parsing, dialog systems, and semantic parsing. Identifying code-switch and monolingual information is useful for better communication in online networking websites. In this paper, we performed a character level n-gram approach to identify monolingual and code-switch information from English-Kannada social media data. We paralleled various machine learning techniques such as naïve Bayes (NB), support vector classifier (SVC), logistic regression (LR) and neural network (NN) on English-Kannada code-switch (EKCS) data. From the proposed approach, it is observed that the character level n-gram approach provides 1.8% to 4.1% of improvement in terms of Accuracy and 1.6% to 3.8% of improvement in F1-score. Also observed that SVC and NN techniques are outperformed in terms of accuracy (97.9%) and F1-score (98%) with character level n-gram.
Indonesian multilabel classification using IndoBERT embedding and MBERT class...IJECEIAES
The rapid increase in social media activity has triggered various discussion spaces and information exchanges on social media. Social media users can easily tell stories or comment on many things without limits. However, this often triggers open debates that lead to fights on social media. This is because many social media users use toxic comments that contain elements of racism, radicalism, pornography, or slander to argue and corner individuals or groups. These comments can easily spread and trigger users vulnerable to mental disorders due to unhealthy and unfair debates on social media. Thus, a model is needed to classify comments, especially toxic ones, in Indonesian. Transformer-based model development and natural language processing approaches can be applied to create classification models. Some previous research related to the classification of toxic comments has been done, but the classification results of the model still require exploration to get optimal results. So, this research uses the proposed model by using different pre-trained models at the embedding and classification stages, in the embedding stage using Indonesia bidirectional encoder representations from transformers (IndoBERT), and classification using multilingual bidirectional encoder representations from transformers (MBERT). The proposed model provides optimal results with an F1 value of 0.9032.
Hate Speech Recognition System through NLP and Deep LearningIRJET Journal
The document describes a proposed system for recognizing hate speech through natural language processing and deep learning techniques. It discusses how hate speech on social media platforms is a growing problem. The proposed system uses techniques like TF-IDF, entropy estimation, and a fuzzy artificial neural network for hate speech recognition. The system preprocesses text data by removing special symbols, applying stemming, and removing stop words. It then classifies text as hate speech or not hate speech using the natural language processing and deep learning models. The authors conducted experiments that showed the system achieved highly positive results in hate speech recognition performance.
A Subjective Feature Extraction For Sentiment Analysis In Malayalam LanguageJeff Nelson
The document discusses sentiment analysis of Malayalam film reviews using machine learning techniques. It proposes using Conditional Random Fields combined with rule-based approaches for sentiment analysis at the sentence and document level in Malayalam. The system is trained on a manually tagged corpus of over 30,000 tokens and tested on film reviews to determine the overall polarity (positive, negative, neutral) and rating of individual categories like film, direction, acting etc. The system achieved an accuracy of 82% in identifying sentiment and ratings.
Design Analysis Rules to Identify Proper Noun from Bengali Sentence for Univ...Syeful Islam
Abstract—Now-a-days hundreds of millions of people of
almost all levels of education and attitudes from different
country communicate with each other for different
purposes and perform their jobs on internet or other
communication medium using various languages. Not all
people know all language; therefore it is very difficult to
communicate or works on various languages. In this
situation the computer scientist introduce various inter
language translation program (Machine translation). UNL
is such kind of inter language translation program. One of
the major problem of UNL is identified a name from a
sentence, which is relatively simple in English language,
because such entities start with a capital letter. In Bangla
we do not have concept of small or capital letters. Thus
we find difficulties in understanding whether a word is a
proper noun or not. Here we have proposed analysis rules
to identify proper noun from a sentence and established
post converter which translate the name entity from
Bangla to UNL. The goal is to make possible Bangla
sentence conversion to UNL and vice versa. UNL system
prove that the theoretical analysis of our proposed system
able to identify proper noun from Bangla sentence and
produce relative Universal word for UNL.
A SURVEY OF GRAMMAR CHECKERS FOR NATURAL LANGUAGESLinda Garcia
This document summarizes several existing grammar checkers for various natural languages. It discusses rule-based, statistical, and hybrid approaches to grammar checking. Grammar checkers described include those for Afan Oromo, Amharic, Swedish, Icelandic, Nepali, and Portuguese. The document analyzes the approaches, methodologies, advantages, and limitations of each grammar checker.
A SURVEY OF GRAMMAR CHECKERS FOR NATURAL LANGUAGEScsandit
This document summarizes and reviews various grammar checkers for natural languages. It begins by defining key concepts in natural language processing like computational linguistics and grammar checking. It then describes the general working of grammar checkers, which involves preprocessing text, analyzing morphology and syntax, and identifying grammatical errors. The document surveys grammar checking approaches for several languages like rule-based, statistical, and hybrid methods. Specific grammar checkers are discussed for languages like Afan Oromo, Amharic, Swedish, Icelandic, Nepali, and Portuguese. The review concludes by analyzing the features and limitations of existing grammar checking systems.
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...Syeful Islam
More than hundreds of millions of people of almost all levels of education and attitudes from different country communicate with each other for different purposes using various languages. Machine translation is highly demanding due to increasing the usage of web based Communication. One of the major problem of Bengali translation is identified a naming word from a sentence, which is relatively simple in English language, because such entities start with a capital letter. In Bangla we do not have concept of small or capital letters and there is huge no. of different naming entity available in Bangla. Thus we find difficulties in understanding whether a word is a naming word or not. Here we have introduced a new approach to identify naming word from a Bengali sentence for machine translation system without storing huge no. of naming entity in word dictionary. The goal is to make possible Bangla sentence conversion with minimal storing word in dictionary.
Hate Speech Detection in multilingual Text using Deep LearningIRJET Journal
This paper aims to detect hate speech in multilingual text using deep learning techniques. Specifically, it focuses on English-Hindi code-mixed text from social media. The paper combines three existing datasets to create a larger consolidated dataset of over 20,000 tweets and comments annotated as hate speech or non-hate speech. It then applies and compares various machine learning and deep learning models on this dataset. The experimental results show that a CNN-BiLSTM deep learning model achieves the best performance with 87% accuracy, 82% precision, and 85% F1 score, outperforming existing approaches.
A New Approach: Automatically Identify Proper Noun from Bengali Sentence for ...Syeful Islam
More than hundreds of millions of people of almost all levels of education and attitudes from different country communicate with
each other for different using various languages. Machine translation is highly demanding due to increasing the usage of web
based Communication. One of the major problem of Bengali translation is identified a naming word from a sentence, which is
relatively simple in English language, because such entities start with a capital letter. In Bangla we do not have concept of small
or capital letters and there is huge no. of different naming entity available in Bangla. Thus we find difficulties in understanding
whether a word is a proper noun or not. Here we have introduce a new approach to identify proper noun from a Bengali sentence
for UNL without storing huge no. of naming entity in word dictionary. The goal is to make possible Bangla sentence conversion
to UNL and vice versa with minimal storing word in dictionary.
Dictionary Entries for Bangla Consonant Ended Roots in Universal Networking L...Waqas Tariq
The Universal Networking Language (UNL) deals with the communication across nations of different languages and involves with many different related discipline such as linguistics, epistemology, computer science etc. It helps to overcome the language barrier among people of different nations to solve problems emerging from current globalization trends and geopolitical interdependence. We are working to include Bangla language in the UNL system so that Bangla language can be converted to UNL expressions. As a part of this process currently we are working on Bangla Consonant Ended Verb Roots and trying to develop lexical or dictionary entries for the Consonant Ended Verb Roots. In this paper, we have presented our work by describing Bangla verb, Verb root, Verbal Inflections and then finally showed the dictionary entries for the consonant ended roots.
A Review Of Text Mining Techniques And ApplicationsLisa Graves
This document provides a review of various text mining techniques and applications. It discusses techniques used for text classification and summarization, including Naive Bayes classification, backpropagation neural networks, keyword matching, and information extraction. It also covers applications of text mining in areas like sentiment analysis of social media posts and hotel reviews. Finally, it discusses the need for organizational text mining to extract useful information and insights from large amounts of unstructured text data.
In recent years, great advances have been made in the speed, accuracy, and coverage of automatic word
sense disambiguator systems that, given a word appearing in a certain context, can identify the sense of
that word. In this paper we consider the problem of deciding whether same words contained in different
documents are related to the same meaning or are homonyms. Our goal is to improve the estimate of the
similarity of documents in which some words may be used with different meanings. We present three new
strategies for solving this problem, which are used to filter out homonyms from the similarity computation.
Two of them are intrinsically non-semantic, whereas the other one has a semantic flavor and can also be
applied to word sense disambiguation. The three strategies have been embedded in an article document
recommendation system that one of the most important Italian ad-serving companies offers to its customers
In recent years, great advances have been made in the speed, accuracy, and coverage of automatic word
sense disambiguator systems that, given a word appearing in a certain context, can identify the sense of
that word. In this paper we consider the problem of deciding whether same words contained in different
documents are related to the same meaning or are homonyms. Our goal is to improve the estimate of the
similarity of documents in which some words may be used with different meanings. We present three new
strategies for solving this problem, which are used to filter out homonyms from the similarity computation.
Two of them are intrinsically non-semantic, whereas the other one has a semantic flavor and can also be
applied to word sense disambiguation. The three strategies have been embedded in an article document
recommendation system that one of the most important Italian ad-serving companies offers to its customers.
NBLex: emotion prediction in Kannada-English code-switchtext using naïve baye...IJECEIAES
This document summarizes a research paper that proposes a lexicon-based method called NBLex to predict emotions in Kannada-English code-switch text. The paper describes creating a Kannada-English code-switch corpus by collecting comments from YouTube and annotating them. A Naive Bayes classifier is used to build the NBLex lexicon and predict emotions. Evaluation shows the NBLex method achieves 87.9% accuracy for emotion prediction, outperforming Naive Bayes and BiLSTM baselines. The paper concludes that a simple additive lexicon approach can be an alternative to machine learning for emotion prediction in code-switch text.
This study explores the correlation between technology utilization and language acquisition while analyzing the impact of moderating variables on this relation. Our meta-analysis approach analyzes data from 43 extracts out of 19 primary studies published between 2012 and 2021. Our data analysis employs a random-effect model utilizing a significance level of α = 0.05. Additionally, the authors examine four moderating variables: level of education, location of research, proficiency in language, and year of publication. Technology-based language acquisition outperforms traditional methods, indicating a significant and moderate impact on the learning process. This study enhances comprehension of the efficacy of technology in language acquisition by identifying various factors, such as the geographical location of research, methods of assessing language proficiency, and technology type employed. However, there is insufficient evidence to support the notion that educational level or sample size significantly impact technology-based language acquisition. This meta-analysis highlights the importance of considering nuanced factors when integrating technology into language learning. The findings emphasize the possibility of technology to transform methods of acquiring language and urge additional investigation into customized strategies that optimize its advantages.
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...Syeful Islam
More than hundreds of millions of people of almost all levels of education and attitudes from
different country communicate with each other for different using various languages. Machine
translation is highly demanding due to increasing the usage of web based Communication. One of
the major problem of Bengali translation is identified a naming word from a sentence, which is
relatively simple in English language, because such entities start with a capital letter. In Bangla we
do not have concept of small or capital letters and there is huge no. of different naming entity
available in Bangla. Thus we find difficulties in understanding whether a word is a naming word
(proper noun) or not. Here we have introduced a new approach to identify naming word from a
Bengali sentence for UNL without storing huge no. of naming entity in word dictionary. The goal is
to make possible Bangla sentence conversion to UNL and vice versa with minimal storing word in
dictionary.
Similar to SENTIMENT ANALYSIS OF BANGLA-ENGLISH CODE-MIXED AND TRANSLITERATED SOCIAL MEDIA COMMENTS USING MACHINE LEARNING (20)
USING STRUCTURAL EQUATION MODELING ON FOREIGN DIRECT INVESTMENT OF INDIAN ECO...indexPub
Purpose: Foreign direct investment (FDI) altogether influences the beneficiary country's financial development, making it more stable, high-quality, and healthy, according to this empirical study based on the present stage of economic development. Thus, every country encountering financial globalization is attempting to lay out a serious business climate to increment worldwide speculation. Design/Methodology/Approach: the main objective of this study is based on Institutional quality or Evidence and I selected 5 factors Institutional Metrics like Voice and Accountability, Civil liberties, Women in parliament, Corruption perceptions, Political rights from DPIIT website (Secondary Data) for the period 2018-2023. Static analysis methods such as the Unit Root Test, the ARDL Approach, and SEM are being used. Originality/Value: The experts in this study used OLS (Least Squares) regression: Foreign direct investment (FDI) streams were the focal point of the exploration. The impact of institutional qualities on unfamiliar direct speculation streams has been explored utilizing the customary least square methodology. Findings: Institutional metrics of government efficacy and corruption have shown a shortrun link with foreign direct investment (FDI) flows, according to the research, which used the ARDL model to find that these indicators had positive coefficient values. As far as institutional markers like law and order, administrative quality, and voice and responsibility, the review found that political stability had a long-term association with foreign direct investment flows (7.4578 > 4.16), placing it above the upper peasant table.
SUSTAINABLE INVESTING UNVEILED: THE ROLE OF BOND RATINGS IN GUIDING GREEN BON...indexPub
The increasing urgency to address climate change has propelled sustainable investing into the spotlight, with green bonds emerging as a pivotal instrument for mobilizing the capital required for environmental projects. This study delves into the critical role that bond ratings play in guiding investments in green bonds, shedding light on how these ratings influence investor confidence and the allocation of funds towards sustainable initiatives. By employing a mixed-methods approach, combining quantitative analysis of green bond performance with qualitative interviews from industry experts, this research offers a comprehensive overview of the interplay between bond ratings and green bond investments. The findings suggest that higher bond ratings, often indicative of lower risk and better sustainability credentials, significantly impact the attractiveness of green bonds to investors. Additionally, the study examines the evolution of rating criteria to encompass environmental, social, and governance (ESG) factors, highlighting the shift towards more holistic assessments of investment risk and potential. This research contributes to the broader discourse on sustainable finance by providing insights into the mechanisms through which bond ratings can facilitate more informed and impactful green bond investments.
AN EXPLONATORY ANALYSIS OF HR ANALYTICS MODEL OVER BIG DATA PROCESS IMPACT ON...indexPub
By generating pertinent indicators, Human Resource Analytics (HRA) can provide HR personnel with a broader perspective on their contribution to the organization's financial objectives. There is a scarcity of research, however, regarding the impact of HRA on business outcomes, specifically in the context of organisations based in India and Vietnam. Within this particular framework, the current study investigates the impact of HRA big data capabilities on business outcomes. The study also investigates the discrepancy between the actual and perceived levels of big data expertise possessed by human resources analysts in Indian and Vietnamese organisations. The current study constructs a conceptual framework in order to examine the hypotheses formulated for assessing the interconnections between the variables being investigated. Utilising the Capability, Motivation, and Opportunity (CMO) framework, it accomplishes this. The data were collected using a quantitative approach, which entailed integrating the various components of HRA expertise and assessing their influence on business outcomes through the utilisation of big data. A systematic questionnaire was developed and distributed to 230 human resources professionals employed by various organisations located in Ho Chi Minh City, Vietnam, and Hyderabad, India. In addition to HR administrators, users of HRAs comprised the participants. A variety of statistical methods were applied to the data to assess the disparity between HRA's anticipated and realised big data capabilities, as well as the impact of HRA on business outcomes. It appears, based on the data that offering incentives and opportunities to employees with HR analytical skills could result in enhanced performance for the organisation. Research has demonstrated that providing opportunities and incentives to skilled employees is crucial for encouraging the development of their analytical abilities. Possessing these types of analytical abilities significantly influences the outcomes of an organisation.
EVOLUTION OF EXPECTATION VIOLATIONS THEORY (EVT) FROM FACE-TO FACE COMMUNICAT...indexPub
The promotion carried out by Holywings in the religious community is a promotional strategy that carries a big risk considering that the majority of Indonesian people are Muslims where liquor is an unlawful item. This was considered a violation and sparked various reactions in society. In the digital era, data sourced from social media is important data to understand the response of the Indonesian people to the violation of the Holywings issue. The research method used is explorative because marketing communication research based on social media data is still new. The research method is descriptive and quantitative by describing variables such as user volume, reach, conversation trends, top tweets, sentiments, influencers, and communication networks between social media users. The research sample uses the period from 23 June 2022 to 30 June 2022. The results of this research indicate that the issue of the Holywings violation is viral compared to other issues. The scope of the issue of violations at Café Holywings spreads throughout Indonesia and is dominant in DKI Jakarta Province. Even though Bali is the biggest branch of Café Holywings, there was very little reaction to the violation of Café Holywings. This is following the theory of expectation violations that the size of the violation reaction on social media is highly dependent on individual characteristics, context, and reaction.
ANTECEDENT OF ORGANIZATIONAL PERFORMANCE IN PT “XXX” PHARMA, TBKindexPub
The objective of this study is to investigate the influence of information technology capability on knowledge management capability, the influence of information technology capability on organisational performance, the influence of knowledge management capability on organisational performance, and the indirect impact of information technology capability on organisational performance through knowledge management capability in PT. "XXX" Pharma, Tbk. The research is based on the understanding of the interconnectedness between information technology cap. The present study utilised a quantitative approach, employing both explanatory and survey methodologies. The process of data collection involved the distribution of a questionnaire to a sample of 44 respondents, which was selected using the purposive random sampling technique. The data processing was performed using route analysis, utilising the IBM Predictive Analytic Software (PASW) version 24. The empirical evidence suggests a statistically significant and positive correlation between the capability of information technology and the capability of knowledge management. Additionally, there is a positive and significant impact of information technology capability on organisational performance. Furthermore, knowledge management capability is found to have a positive and significant effect on organisational performance. Lastly, it is observed that information technology capability positively and significantly influences organisational performance through its impact on knowledge management capability.
UNRAVELLING THE MENTAL HEALTH LANDSCAPE: EXPLORING DEPRESSION AND ASSOCIATED ...indexPub
Introduction: The prevalence of depression and its correlates in Bangladeshi rural university students have been rarely investigated. We draw a literature review, a cross-sectional study and analysis of the rural students’ depression natures and mechanisms that influence their academic performance and health and well-being. Methods: A cross-sectional research was conducted during the period august 2019 to January 2020 in a university. We employed Beck Depression Inventory scale to collect data from 200 undergraduate and graduate students. Data were analysed using chi-square association test and ordinal logistic regression. Results: We discovered that mild to severe depression affected 60% of rural students [mild (16%), borderline (10%), moderate (12%), severe (11.5%), and extreme (10.5%)]. Family expectations, smoking, bad academic achievement, inability to enroll in a particular program, and inadequate household finances were significant risk factors for depression. When it comes to depression, male students scored noticeably higher than female pupils. The decreased depression was linked to both strong household economics and intellectual achievement. Conclusions: The intricate interactions among the risk factors influence the character and processes of depression in rural students.
IMPACT OF PERFORMANCE MANAGEMENT ON SUCCESSION PLANNINGindexPub
Motivation: HR in an organization faces various challenges in business environment, such as Building Capabilities, Improving Productivity, Building Performance Culture, Talent Management, Succession Planning for Key Leadership and Critical Roles, Developing Accountability and Ownership, Human Capital Management and transforming HR function into developmental Role from the legacy driven HR, etc. Succession Planning is the process of identifying and developing individuals, who have potential to hold the key leadership position in an Organization, whereas Performance Management includes assessing and improving upon the performance of an employee to meet the organizational goals. There are several Management Practices, which are adopted widely in Industry to make a successful Succession Planning. Workforce and Talent Management is one of them. The health of an organization majorly depends on the proper placement of people, which is a combined outcome of Talent Identification, Talent Development and Talent Retention. Performance Management plays a vital role in Talent Identification. It also has an impact on Talent Development and Talent Retention. The key idea of succession planning suggests that the right person to be placed at the right position at the right time. Succession planning is becoming a challenge these days in the corporate world. Organizations are often not found prepared with their successors to occupy the key positions as and when required. The positions are either kept vacant for a substantial period or more than one role is assigned to a single person. Identifying the right talent for the key positions from outside the organization and recruiting them is a much more difficult task at the eleventh hour. This has a significant impact on organizational health and in turn to organizational sustainability. Organizations must last longer than people. Role of organization continues even when the people move out. Employees must superannuate after attaining a certain age. Also, organizations must have a contingency plan for sudden vacancy arises out of attrition, health hazards and death of employee. Succession planning is the strategy to ensure that a suitable person is made available during exigencies. Employees are developed for taking on higher responsibilities and for the new roles that may emerge in future. The placement of Key Leadership positions can be executed either by inviting the talent from outside or developing the talent in-house. The latter is always in demand keeping in view the core values of the organization and the impact on loyalty and organizational culture in a long run. It is preferable to develop the in-house talent pool to reduce dependency on recruitment of experienced people from outside for the critical roles. It brings the talent acquisition cost low and contributes as a motivating factor for the team as well. The acceptability of a person placed at Top / Key Leadership Positions is high when these are occupied
EXTERNAL BEHAVIOURAL FACTORS IMPACT ON INVESTMENT DECISIONS OF INDIVIDUAL INV...indexPub
The study collects data from a sample of individual investors and analyses their responses to recent financial events, changes in market trends, and economic forecasts. By examining factors such as demographic profiles, financial literacy, risk tolerance, and market perceptions, the research aims to identify significant predictors of investment decisions in this demographic. The findings suggest that investors are predominantly influenced by financial news, peer influence, past investment performance, and the economic stability of the region. This study contributes to the field by highlighting the localized factors impacting investment choices and providing insights for financial advisors and investment firms to tailor their strategies according to investor needs and regional specifics.
GLOBAL RESEARCH TREND AND FUTURISTIC RESEARCH DIRECTION VISUALIZATION OF WORK...indexPub
Purpose – The purpose of this research is to undertake a bibliometric analysis of working capital management. The study examines papers from time period 1974-2023and performed performance analysis, co-citation analysis, bibliographic coupling and scientific mapping. Design/methodology/approach – The study examines 174 articles retrieved from the Scopus database using bibliometric analysis, performance analysis and thematic clustering. The study looked at the scientific productivity of papers, prolific authors, most influencing papers, institutions and nations, keyword co-occurrence, thematic mapping, co-citations and authorship and country collaborations. VOSviewer was as a tool in the research to conduct the performance analysis and thematic clustering.The watchword "Working Capital Management" was used to include only English-language articles. Findings – The most productive year was 2022 with 26 publications. Martínez and García- are the most protuberant authors with 708 citations. The findings of the study shows that the most influential institutions are ‘The Department of Management and Finance, Faculty of Economy andBusiness and Department of Management and Finance, Faculty of Economics and Business, The University of Murcia, Spain with 381 & 297 citations. Among,thecountry analysis,Spain with 744 citations stands first of all other nations for publication on Working Capital Management. Kärri is the most productive author with 7 documents. Country-wise analysis reveals that the United States is the most productive country for Working Capital Management research with 40 documents.The authors also identified seven thematic clusters of Working Capital Management. Research limitations/implications – It informs and directs researchers on the current state of study in the field of Working Capital Management.The present study has quite a few implications forSmall & Medium enterprise managers, entrepreneurs, financial managers, academicians and scholars. It also outlines future research directions in this field.Present study provides an inclusive acquaintance about the working capital management till date. Originality/value – This is the first study which provides the performance analysis and scientific mapping of the all published documents on working capital management between the time periods 1974-2023
A SOCIAL CAPITAL APPROACH TO ENTREPRENEURIAL ECOSYSTEM AND INNOVATION: CASE S...indexPub
Despite being recognised as drivers of innovative development, Micro, Small, and Medium-Sized Enterprises (MSMEs) frequently confront resource limitations. Therefore, enhancing the ecosystem is contingent on the entrepreneurs’ social capital, which is crucial for the success of MSMEs. This study applies the social capital approach to analyse the entrepreneurial ecosystem enrichment and its impact on the innovation process of cosmetics MSMEs. The qualitative case study of six cosmetic manufacturing MSMEs explores that social capital is a multifaceted asset to MSMEs. Through an in-depth thematic analysis of three dimensions of social capital (structural, relational, and cognitive), this study states that the innovation process is supported by the synergistic transformation of one dimension of social capital into another. Entrepreneurs sharing the common norms, rules, and language enrich their cognitive as well as relational aspects of ecosystem. The study suggests that as network ties, trust, and norms collectively influence innovation in firms, hence, social capital needs to be studied with its contextualization in the ecosystem.
ASSESSING HRM EFFECTIVENESS AND PERFORMANCE ENHANCEMENT MEASURES IN THE BANKI...indexPub
This study employs an exploratory and quantitative research approach to systematically investigate the impact of Human Resource Management (HRM) practices on Organizational Performance within the Indian Banking sector. The research approach combines exploratory research, aimed at gaining insights into HRM practices, with a quantitative approach using a purposive sampling technique. Data is collected through a questionnaire from employees in both public (SBI) and private banks (HDFC Bank) who work in HR departments or are involved in HR activities. The Likert scale is utilized in the questionnaire to measure participant perceptions of HRM practices. The study utilizes two statistical tools: Neural Network and Exploratory Factor Analysis (EFA). The findings of the study highlight the significance of promotion and transfer policies, considered paramount in influencing organizational performance in both public and private banks. Additionally, the study underscores the importance of training and development initiatives in enhancing employee skills and competencies. Clear and effective communication within HR policies is identified as pivotal in improving organizational performance. Lastly, aligning HRM practices with sector-specific goals is recognized as a significant contributor to improved employee satisfaction and overall performance in the banking sector. The findings offer guidance for HR practitioners and policymakers in optimizing HRM practices to achieve better organizational performance.
CORRELATION BETWEEN EMPATHY AND FRIENDSHIP QUALITY AMONG HIGH SCHOOL STUDENTS...indexPub
In this research were used two questionnaires Empathy Formative questionnaire and Friendship Quality Scale. The aim of this study is to see the relationship between empathy and friendship quality among adolescent, to find out if there are gender differences in empathy and friendship quality, and to see if there are any differences between younger and older students on examined variables. This research was done with 65 high school students. Age of the students were 15 to 17 years old. Results show that there is a correlation between empathy and friendship quality. The results of t test show that there are not significant differences between females and males on variable empathy. Girls and boys have significant difference in friendship quality in Kosovo. There are no significant differences between older students and younger students in the level of empathy and also there are no significant differences between older students and younger students in the level of friendship quality.
LEVELS OF DEPRESSION AND SELF-ESTEEM IN STUDENTSindexPub
Introduction: among the most worrying problems in recent years are low self-esteem, family and friends problems, anxiety, stress, and depression, which are taking on alarming proportions in students and young people in general. Purpose: the study is a prediction, which focuses on analyzing and evaluating students' self-esteem and level of depression. Methodology: the population is 332 students (13-15 years old) in high schools in the Gjakova region. The study describes the analysis, classification, and evaluation of the collected data by doing the analysis and real examination of the findings. Results: in terms of gender there is no significant difference in self-esteem, while in depression there is a significant difference. The level of depression is higher in women (11.9) than in men (9.5). Economic status shows that students with employed fathers have lower depression (6.77) compared to those with unemployed fathers (10.80). Conclusions: The level of depression and self-esteem and parental reflection affect students. A link has been found between economic status and emotional problems and student behavior. To prevent it, the psychological service in schools should function, and together with families and the community should be as close as possible to the problems of students.
THE IMPACT OF SOCIAL FACTORS ON INDIVIDUALS DIAGNOSED WITH SCHIZOPHRENIAindexPub
The society with diverse structural and ideological influences, assumes its role in relation to behavior, attitude, belief and relations. The impact can be seen in every society globally, however the western nations have adjusted their social policies to suit these transformations, whereas nations in developing phase have failed to establish suitable systems. In Kosovo, the allocation of funds for mental health services remains insufficient, even though mental health disorders account for 12.3% of overall illnesses and 30.8% of work incapacities! The objective of this study is to examine the impact of society on the decline and recovery of individuals with schizophrenia. The study employs both qualitative and quantitative methods to provide a descriptive-analytical. A study was conducted in four municipalities of Kosovo, using individuals with schizophrenia from psychiatric institutions as subjects along with their caregivers/family members . The research found that social factors greatly contribute to the worsening of schizophrenia patients' condition. The presence of schizophrenia is evident through a higher likelihood of having a low level of education, high unemployment rates, and engaging in harmful behaviors like tobacco and alcohol use, as well as physical inactivity. Significant correlations have been observed in the subscales of positive and negative symptoms using the Self-Report PNS-Q questionnaires. It is crucial for individuals with schizophrenia to have a carefully designed strategy in place, developed in partnership with professionals from various relevant fields such as social protection, psychiatric medical services, education, and social integration plans.
RETURN ON EQUITY (ROE) AS MEDIATION OF BANK'S CAPITAL ADEQUATION RATIO (CAR)indexPub
Banks need to maintain their performance and the level of Capital Adequasi Ratio (CAR). This study wants to see the variables that affect the Capital Adequasi Ratio (CAR) and see ROE as a variable that mediates the Capital Adequasi Ratio (CAR) at Bank Rakyat Indonesia (BRI). The research method used multiple regression analysis, t-test, Anova test and Coefficient of Determination and the research period for 14 years from 2009 to 2022, by using SPSS Software version 26. The conclusion of the study, only the BOPO variable has a significant effect on the Capital Adequasi Ratio (CAR) and the ROE variable as a variable that can mediate the CAR variable at Bank Rakyat Indonesia (BRI). Keywords: Capital Adequasi Ratio, Bank Financial Ratio.
INNOVATIVE DESIGN FOR KIDS MASTERY IMPROVEMENT OF LANGUAGE FEATURES IN A STORYindexPub
One of the hardest things for people learning English as a third language is still reading and writing. Because they are still not good enough at language features, they often make big mistakes and assumptions that aren't true. To make learning more fun and useful, visual symbols were made for seven different kinds of language traits. It looks at the Vipicoll form a lot. Visual Symbols media, Picture and picture, and the Collaborative approach are all creatively put together in Vipicoll. This research used Reeve's design method. Research develops Vipicoll learning model, employing interviews, literature reviews, and questionnaires for iterative improvement and validation. Researchers identify problems, create Vipicoll, iteratively refine through trials, forming an effective English Language Education model. Study assesses individual English thinking development, emphasizing interpretive framework, relation, function, and unique visual symbols. From this research, it was found that using Vipicoll really helps improve kids' mastery of language features, especially those in a story. This is proven by the fact that after implementation, kids' correct answers when asked directly by their teachers and their written test answers increased greatly even though many direct answers and test answers used to be wrong and they often didn't understand.
CARDIOVASCULAR DISEASE DETECTION USING MACHINE LEARNING AND RISK CLASSIFICATI...indexPub
The global prevalence of heart disease indicates a major public health issue. It causes shortness of breath, weakness, and swollen ankles. Early heart disease diagnosis is difficult with current approaches. Hence, a better heart disease detection tool is needed. Treatment requires more than just diagnosis. Risk classification is critical for accurate diagnosis and treatment. In this analysis, a novel cardiovascular disease (CVD) detection paradigm using machine learning (ML) and risk classification based on a weighted fuzzy system is proposed. The system is developed based on ML algorithms such as artificial neural network (ANN) and Long Short-Term Memory (LSTM) and uses standard feature selection techniques knowns as Principal Component Analysis (PCA). Furthermore, the cross-validation method has been used for learning the best practices of model assessment and for hyperparameter tuning. The accuracy-based performance measuring metrics are used for the assessment of the performances of the classifiers. Finally, the outcomes revealed that the proposed model achieved an accuracy of 94.01% which is higher than another conventional model developed in this domain. Additionally, the proposed system can easily be implemented in healthcare for the identification of heart disease.
ANALYSIS OF FLOW CHARACTERISTICS OF THE BLOOD THROUGH CURVED ARTERY WITH MIL...indexPub
Narrowing of the arteries caused by atherosclerosis reduces blood flow to the heart, which results shows ischemia, angina pectoris, cerebral strokes, and other coronary artery disease signs and symptoms. Curvature is seen in blood vessels at various locations. The stenotic surface provides an additional curvature and the point of maximum shear which varies with the cross-section. A cylindrical form of the Navier-Stokes equations in polar coordinate system have been extended to include dynamic curvature along the axial direction. The blood flow behavior of taking different values of blood parameters like viscosity, the radius of the artery, and the thickness of the stenosis has been studied with and without curvature by using an extended blood flow model with dynamic curvature. Moreover, the aspects of blood flow, such as dynamic curvature velocity profile, volumetric flow rate, pressure drop, and shear stress, have been studied in relation to blood flow around curved arteries with stenosis, variations in the radii of the artery, thickness of the stenosis, and viscosity. The information may reveal that by increasing the values of curvature, viscosity, and thickness of stenosis, velocity, and volumetric flow rate can be quickly reduced. Increasing the curvature, viscosity, and thickness of stenosis also results in an increase in shear stress and a pressure drop. The presence of curved stenotic arteries has a significant impact on the flow parameters, and it is crucial to know about these dynamics in order to study the cardiovascular system.
ANALYSIS OF STUDENT ACADEMIC PERFORMANCE USING MACHINE LEARNING ALGORITHMS:– ...indexPub
Student academic performance is the great value of institutes, universities and colleges. All colleges majorly focus on the career development of students. The academic performance of students plays a vital role in the establishment of a bright career. On the basis of better academic performance, the placement of the students will be better and the same will be reflected in the form of better admission and future. Machine learning can be deployed for the prediction of student performance. Various algorithms are playing an important role in the prediction of the accuracy of various machine learning models. These articles discuss various algorithms that can be helpful to deploy for predicting student academic performance. The article discusses various methods, predictive features and the accuracy of machine learning algorithms. The primary factors used for predicting students performance are academic institution, sessional marks, semester progress, family occupation, methods and algorithms. The accuracy level of various machine learning algorithms is discussed in this article.
IMPLEMENTATION OF COMPUTER TECHNOLOGY IN BLENDED LEARNING MODELS: EFFECTS ON ...indexPub
This study was conducted to identify the influence of computer technology in blended learning on the achievement in the Principles of Accounting subject through of self-directed learning. The research also assessed the relationship of the elements of blended learning on student achievement. Despite the encouragement by the Ministry of Education for the use of Computer Technology In Education, there is a lack of research on a measurable and testable model of the influence of computer technology. In reality, various aspects such as schools, teachers, content, and technology exist to provide and utilize computer technology through learning in Malaysia. A quantitative study using a correlational design was conducted on 400 Form Four students in secondary schools in the Southern Zone of Malaysia, namely in the states of Johor, Melaka, and Negeri Sembilan, to identify the influence of computer technology in blended learning on achievement. Data were collected using adapted and modified questionnaires from previous studies. Descriptive data analysis was performed using SPSS version 28, while inferential analysis was conducted using the Smart PLS analysis technique. Smart PLS version 4.0 software was utilized to test the mediator relationships in the study. The results of the study showed high minimum scores for blended learning through computer technology and self-directed learning, as well as achievement. The influence of blended learning elements also had a significant relationship with student achievement in the Principles of Accounting subject. This study is expected to contribute to the effectiveness of blended learning through information technology on the achievement in the Principles of Accounting subject by enhancing self- directed learning among students. The development of this conceptual model is hoped to serve as a guide for policymakers, the Ministry of Education, teachers, students, and other stakeholders in ensuring that blended learning practices can be implemented more effectively. Furthermore, it is hoped that the achievement and interest in the Principles of Accounting subject can be improved by applying computer technology in learning.
20 Comprehensive Checklist of Designing and Developing a WebsitePixlogix Infotech
Dive into the world of Website Designing and Developing with Pixlogix! Looking to create a stunning online presence? Look no further! Our comprehensive checklist covers everything you need to know to craft a website that stands out. From user-friendly design to seamless functionality, we've got you covered. Don't miss out on this invaluable resource! Check out our checklist now at Pixlogix and start your journey towards a captivating online presence today.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications.He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfMalak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...Zilliz
Join us to introduce Milvus Lite, a vector database that can run on notebooks and laptops, share the same API with Milvus, and integrate with every popular GenAI framework. This webinar is perfect for developers seeking easy-to-use, well-integrated vector databases for their GenAI apps.
Communications Mining Series - Zero to Hero - Session 1DianaGray10
This session provides introduction to UiPath Communication Mining, importance and platform overview. You will acquire a good understand of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features
available on those devices, but many of the features provide convenience and capability but sacrifice security. This best practices guide outlines steps the users can take to better protect personal devices and information.
Large Language Model (LLM) and it’s Geospatial Applications
SENTIMENT ANALYSIS OF BANGLA-ENGLISH CODE-MIXED AND TRANSLITERATED SOCIAL MEDIA COMMENTS USING MACHINE LEARNING
1. Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/
Journal of Tianjin University Science and Technology
ISSN (Online):0493-2137
E-Publication: Online Open Access
Vol: 56 Issue: 11:2023
DOI: 10.5281/zenodo.10152990
Nov 2023 | 26
SENTIMENT ANALYSIS OF BANGLA-ENGLISH CODE-MIXED AND
TRANSLITERATED SOCIAL MEDIA COMMENTS USING MACHINE
LEARNING
FAHIMA HOSSAIN
Lecturer, Department of Computer Science & Engineering, Hamdard University Bangladesh, Bangladesh.
E-mail: minda.fahima25@gmail.com
NUSRAT JAHAN
BSc in CSE, Hamdard University Bangladesh, Bangladesh. E-mail: nusrat.jahan.az@gmail.com
JOYREYA ABADIN
BSc in CSE, Hamdard University Bangladesh, Bangladesh. E-mail: ajoyreya285@gmail.com
Abstract
In this era of technology, online communication has expanded to incorporate different language
combinations and expressive styles. Code-mixing, or combining languages such as Bengali and English,
is frequent in multilingual environments. In addition, the use of emojis, which are small graphical icons, has
increased the complexity of online communication by allowing for more emotional expression. Sentiment
analysis of comments on social media that are combined with emojis and written in Bangla-English
language is the main topic of this work. This research aims to understand and classify the sentiments
expressed in code-mixed comments, including the nuanced role of emojis. The data was preprocessed
first, then feature extraction was performed using the TF-IDF Vectorizer and CountVectorizer algorithms.
For the analysis, nine different machine learning algorithms were used. With a remarkable accuracy of
85.7% and an F1 score of 85.0%, the Support Vector Classifier stood out as the most successful model.
This highlights the utility of including emoji-based features for complex sentiment analysis, particularly when
dealing with code-mixed data. The dataset contained 2055 comments from Facebook pages, including
Bangla-English comments with and without emojis and comments that only contained emojis. To prepare
the dataset for analysis, preprocessing methods included removing irrelevant data and converting emojis
into Unicode short names.
Index Terms: Sentiment Analysis, Emoji Analysis, Natural Language Processing, Machine Learning, Code-
mixing, Emoji, Unicode.
1 INTRODUCTION
Today, individuals effortlessly blend languages while communicating online, which is
referred to as code-mixing, in order to express their thoughts and emotions. People who
are fluent in multiple languages often blend them together, incorporating English elements
into their native language. The fusion of English and Bengali, known as Benglish, is a
prime example of this phenomenon. It enables individuals to convey their thoughts and
ideas by using a blend of both languages. This linguistic phenomenon highlights the ever-
changing nature of communication within multilingual groups [1].
Sentiment analysis, a crucial area of natural language processing (NLP), focuses on
understanding and classifying the emotions and points of view expressed in textual data.
According to the analytic context, these attitudes are divided into a variety of groups,
2. Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/
Journal of Tianjin University Science and Technology
ISSN (Online):0493-2137
E-Publication: Online Open Access
Vol: 56 Issue: 11:2023
DOI: 10.5281/zenodo.10152990
Nov 2023 | 27
including negative, neutral, and positive, with the possibility to further segment them into
extremely negative, strongly positive, and other categories [2]. This technique has a wide
range of applications, from business insights to election forecasts, by decoding feelings
in customer feedback, social media posts, and public opinion [3, 4]. However, whereas
sentiment analysis research thrives in English, Bangla lags behind due to linguistic
difficulty and a lack of resources [5].
Emojis have revolutionized modern communication and have become a common feature
in social media conversations [6]. Through the effective communication of emotions,
context, and intent, these Unicode symbols cross linguistic and cultural boundaries [7, 8,
9]. On Messenger, more than 900 million emoticons are exchanged every day, while
Facebook sees more than 700 million emojis shared in a day [10]. Emojis can substitute
for words and represent a variety of ideas, including emotions, objects, activities, and
more because of their adaptability [11]. Emoji usage is widespread, but little study has
been done on it [12, 13, 14]. Emojis and text are combined in this study, which recognizes
the variety of expressions they provide. Including emojis in textual sentiment analysis is
beneficial. They provide a universal language [7, 8, 9] for expressing emotions, enhance
prediction accuracy, and align with contemporary communication styles.
This study collects Facebook data, which consists of Bangla-English comments with and
without emojis, along with comments that only contain emojis. The dataset contains a
total of 2055 labeled points and has been preprocessed and balanced using TF-IDF and
Count Vectorizer before being split into training and testing. To perform sentiment
analysis, nine categorization algorithms are utilized.
This work aims to fill the research gap on extracting sentiment from code-mixed Bengali-
English texts with emojis in social media. The contributions of this study are as follows:
1. Created a dataset of tagged Bangla-English code-mix with added emojis.
2. Managed imbalanced data properly to reduce overfitting.
3. Created a customized Bangla-English lemmatizer.
4. Developed an empirical model for sentiment analysis.
This research is divided into numerous Sections that methodically cover various parts of
the investigation. Section 2 offers a concise summary of related works to determine the
current state of knowledge in the subject matter. In Section 3, a summary of the data-
gathering process is provided, including details on data storage and the methodology
used for the study. Furthermore, Section 4 evaluates the proposed model and presents
its outcomes, providing significant information on its effectiveness. The findings of the
paper are presented in Section 5, which also suggests possible directions for further
research.
3. Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/
Journal of Tianjin University Science and Technology
ISSN (Online):0493-2137
E-Publication: Online Open Access
Vol: 56 Issue: 11:2023
DOI: 10.5281/zenodo.10152990
Nov 2023 | 28
2. LITERATURE REVIEW
Mahmud AA et al. [15], proposed an approach to sentiment analysis that supports Bangla-
English code-mixing and transliterated features. The authors pre-processed the data by
stopword remove, punctuation remove, pos tagging and for word embedding they used
glove, and word2vector, and after all that, LSTM and BERT were applied.
Khan et al. [16], proposed sentiment analysis approach for Bangla language and worked
with five different emotions: Happy, Sad, Angry, Surprised, and Excited. Additionally, their
paper also deals with two categories: Abusive and Religious. The authors used the TFiDF
method for converting the data into weighted vector form. The total work has done in two
experimental steps. In the first experiment, they trained the dataset with algorithms. In the
second experiment, they mapped their 7 classes (Sad, Happy, Angry, Excited, Religious,
Abusive, Surprised) into three higher classes (Positive, Negative, and Neutral). The best
accuracy (62%) for 7 class achieved using Support Vector Machine (SVM). And for 3
classes, 73% is the best accuracy obtained using Support Vector Machine (SVM).
Khan et al [17], proposed a model for depression analysis from Bangla post via Natural
Language Processing (NLP). For data cleaning they added contraction, removed regular
expression, removed stop words and tokenized the data using CountVectorizer. The
Multinomial Naive Bayes algorithm provide best accuracy for their model.
Al-Azani et al. [2], introduced a approach by incorporating nonverbal features, specifically
emojis, for sentiment analysis of Arabic microblogs where each instances, contained at
least one emoji. For feature selection, they used ReliefF and Correlation-Attribution
Evaluator (CAE) as their approach was text-independent and would only focus on emojis
features. The study achieves an F1 score of 80.30% and an AUC (Area Under the Curve)
of 87.30% by selecting multinomial naive Bayes classifier and 250 of the most relevant
emojis. The authors indicate that using emoji-based features alone can be highly effective
in detecting sentiment polarity.
Mandal et al. [18], addressed sentiment analysis for code-mixed social media content.
They created a Bengali-English dataset and utilized hybrid approaches for language and
sentiment tagging, obtaining 81% accuracy in language identification and 80.97%
accuracy in sentiment classification. The research paper provides valuable resources for
code-mixed sentiment analysis, which could potentially inspire more research in this
complex field of natural language processing.
Singh et al. [1], describe their findings on sentiment analysis of Hinglish (English-Hindi
mix) code-mixed social media tweets. Data consolidation, cleansing, transformation, and
modeling are all part of the research. There were several vectorization approaches and
machine learning algorithms utilized. The SemEval-2020 dataset was divided into
positive, neutral, and negative feelings. The ensemble voting classifier attained the best
F1-score (69.07), suggesting the possibility for additional improvements employing neural
networks and advanced data processing techniques. The research underlines the need
to normalize Hindi spellings and discusses potential improvements for Hindi words using
neural networks and methods such as lemmatization and part-of-speech tagging.
4. Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/
Journal of Tianjin University Science and Technology
ISSN (Online):0493-2137
E-Publication: Online Open Access
Vol: 56 Issue: 11:2023
DOI: 10.5281/zenodo.10152990
Nov 2023 | 29
Alzubaidi et al. [14], employed heterogeneous data mining (HDM) for integrated
processing to explore sentiment analysis of tweets utilizing both text and emojis. The
results indicate that using HDM along with text or emojis helps improve sentiment analysis
accuracy. With Support Vector Machines (SVM), the accuracy reached 85.89%. This
highlights the significance of incorporating emojis to enhance bi-sense sentiment analysis
accuracy.
Tareq et al. [19], in this study the author's utilized a Bangla-English code-mixed (BE-CM)
dataset they created to address challenges related to sentiment analysis in low-resource
languages like Bangla. Their data augmentation technique improves cross-lingual
comprehension in code-mixed expressions, achieving an 87% weighted F1 score using
XGBoost and FastText embeddings. The work tackles the issues of sentiment analysis in
multilingual communities, particularly for languages with limited resources like Bangla.
Jamatia et al. [20], the paper research sentiment analysis in Indian language code-mixed
social media texts. For sentiment prediction in English-Hindi and English-Bengali, they
compare standard machine learning approaches with deep learning models such as
BiLSTM-CNN, Double BiLSTM, Attention-based, and BERT. The deep learning models
consistently outperformed the classical methods, which is pretty impressive. The
Attention-based model achieved the highest F1 scores of 67.5 (English-Bengali) and 60.4
(English-Hindi). It's interesting to note that while domain-specific BERT embeddings show
potential, pre-trained BERT embeddings are still pretty competitive. Overall, the study
highlighted the challenges in code-mixed sentiment analysis and emphasized the value
of deep learning as well as the need for more extensive datasets.
Limitations: The existing literature reveals a variety of approaches to sentiment analysis,
with some focusing solely on code-mix languages, others on emojis exclusively, and
some on text and emojis in native languages. However, there is still a lack of research
on Bangla-English code-mixed language, where English elements are integrated into
native language text through phonetic typing. Our study presents a sentiment analysis
framework for Bangla-English code-mixed language with emojis, as no previous
exploration has been conducted on this topic yet. The studies discussed in this section
are summarized in Table 1 based on various attributes.
Table 1: Overview of Relevant Studies Referenced from [1, 2 14-20].
Authors Contributions Datasets Class
Performances
(Best Score)
Mahmud
AA et al.
[15]
1. Applied two types of feature
extraction
methods to convert text into
vectors.
2. Plotted 100 dimensions’ vectors
into 2 dimensional graphs.
3. Used test set differently to
measure the performance of the
model. The first test set contains
10% code mix data and the second
Bkash app
review from
google playstore
(0.2 million
sample with 12%
bangla-english
mixed sentence)
Positivie,
Negative
Accuracy: 94%
5. Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/
Journal of Tianjin University Science and Technology
ISSN (Online):0493-2137
E-Publication: Online Open Access
Vol: 56 Issue: 11:2023
DOI: 10.5281/zenodo.10152990
Nov 2023 | 30
test set contains 42% code mix
data.
Khan et
al. [16]
1. Created dataset of pure Bangla
text or Bangla text mixed with
some characters.
2. Applied Tf-idf as feature
extraction
Facebook post
comments
Positivie,
Negative,
Neutral
Accuracy: 73%
Khan et al
[17]
1. Build a model to detect
depression related post.
2. Applied Tf-idf as feature
extraction
Bengali post of
facebook
Happy, Sad Accuracy:
86.67%
Al-Azani
et al [2]
1. Conducted research on Arabic
sentiment analysis to evaluate
the impact of different emojis
on predicting sentiment
polarity.
Arabic
microblogs
(2091 sample)
Positivie,
Negative
Accuracy:
80.34%
Soumil
Mandal et
al. [18]
1. Create a gold standard Bengali-
English code-mixed dataset.
2. Develop an accurate language
identification system with an
81% accuracy rate.
3. The dataset, which includes
code-mixed text and is
annotated with language and
polarity tags, is now accessible
in JSON format for future
research purposes.
Gold standard
Bengali-English
code-mixed
dataset
(600 tweets)
Positive,
Negative,
And
Neutral
Accuracy:
80.97%
Singh et
al. [1]
1.Applied different data-
cleaning approaches,
transformation techniques, and
machine-learning algorithms.
2. Identified areas for growth and
development, such as neural
network exploration and spell
normalization.
Task 9:
Sentiment
Analysis of
Code-Mixed
Social Media
Text (Hinglish)
(14,000 tweets)
[21]
Positive,
Negative,
And
Neutral
F1-score:
69.07%
Alzubaidi
et al. [14]
1. Uses texts and emojis to
demonstrate HDM's
supremacy for sentiment
analysis.
2. Highlights the efficacy of emojis
even in small quantities.
3. Introduces a uniform approach
for both text and emoji
processing.
English corpus
composed of
16,207 tweets
[22]
Positive
and
Negative
Accuracy:
85.89%
Tareq et
al. [19]
1. Creating the BE-CM in code-
mixed Bangla-English text.
2. Evaluating several models and
BERT iterations for sentiment
categorization.
3. Introducing an effective data
augmentation strategy for
Gold standard
Bangla-English
code mix (BE-
CM) dataset
(18,074
Sentences)
Positive
state,
Negative
state,
Mixed
positive,
Mixed
F1- score: 87%
6. Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/
Journal of Tianjin University Science and Technology
ISSN (Online):0493-2137
E-Publication: Online Open Access
Vol: 56 Issue: 11:2023
DOI: 10.5281/zenodo.10152990
Nov 2023 | 31
cross-lingual sentiment
analysis.
negative
and Neutral
Jamatia et
al. [20]
1. Demonstrates the effectiveness
of deep learning in code-mixed
sentiment analysis.
2. Increse sentiment analysis in
multilingual contexts
ICON-2017 SAIL
Tool-Contest
Annotated
Corpora [43],
Joshi et al. [24],
SemEval 2013
Task 2B Twitter
Dataset [25],
Barnes et al. [26]
Positive,
Negative,
And
Neutral
F1-scores:
67.5%
(English-
Bengali) and
60.4%
(English-Hindi)
3. METHODOLOGY
The proposed model working mechanism to detect sentiment is present in this section.
Figure 1 illustrates the approach used in this model.
Fig. 1: Flow Diagram of the Proposed Model.
3.1 Data collection and dataset preparation
Machine learning heavily relies on data as a fundamental component. The data used to
utilize this paper is the comments of general people that have been collected manually
from different Facebook pages using facepager. Facepager uses a JSON-based API to
fetch publicly accessible data from Facebook, YouTube, and Twitter. An access token is
required, generated automatically when a profile is logged in. An additional method for
obtaining access tokens is through the utilization of a website called the 'Graph API
7. Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/
Journal of Tianjin University Science and Technology
ISSN (Online):0493-2137
E-Publication: Online Open Access
Vol: 56 Issue: 11:2023
DOI: 10.5281/zenodo.10152990
Nov 2023 | 32
Explorer - Facebook Developer.' Facepager exports raw data as an Excel CSV sheet and
stores it in a SQLite database (2).
2055 comments have been collected that are mainly Bangla, written using English letters
(generally known as Banglish), mix of Banglish comments with or without emoji, or only
emoji. Along with these, some commonly used English words and sentences that are
mostly used (such as: ok, well done, good, problem, best of luck, etc) are also consumed
as collected comment. The collected comments are then exported as an Excel CSV
sheet. Each comment has been labeled manually based on its real aspect of use. Overall
nine columns are present in the dataset that is explained in Table 2. Figure 1 represents
the sample scenario of the dataset.
Table 2: Columns included in the dataset.
Column Description
Page_link The page link of particular post from which comments are collected
Page_id The page id of particular post from which comments are collected
Query_time The time when the comments have collected
Post_id The post id of post from which comments are collected
Post_created_time The post created time of post from which comments are collected
Post The post from which comments are collected
Comment_created_time Commented time
Comment Comment of general people
Sentiment Class (positive, Negative, Neutral) of comment
Fig. 2: The sample scenario of the dataset.
8. Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/
Journal of Tianjin University Science and Technology
ISSN (Online):0493-2137
E-Publication: Online Open Access
Vol: 56 Issue: 11:2023
DOI: 10.5281/zenodo.10152990
Nov 2023 | 33
3.2 Data preprocessing
To enhance the performance of the machine learning model raw data needs to be
preprocessed. At the very first step of data preprocessing, keeping only comments and
sentiment columns, all other unnecessary columns have been dropped. The
preprocessing steps followed for this model are:
1. Handling missing data: The handling of missing data operations is carried out on the
entire dataset. As missing data can bias the result of the model, If a row with a missing
value is detected, the entire row is dropped.
2. Remove non relevant information: hyperlink, URL, email, name mention, hashtag,
number, new lines, and extra space have no significant rules for detecting sentiment.
So, these are removed from the comments.
3. Convert in lowercase: To avoid any case sensitivity all letters are converted into
lowercase.
4. Remove punctuations: Punctuation as like ('!"#$%&'()*+,-./:;<=>?@[]^_`{|}~') has
been removed from comments.
5. Convert emoji to Unicode CLDR: Each emoji has a unicode CLDR short name, like
(😢 | crying face). For better performance in the model, emojis are converted into
Unicde CLDR shortname.
6. Remove duplicate values: This operation is done in two steps. In the first step, if any
word contains sequentially repeated letters more than twice, then the repeated letters
have been removed. Such as (onnnnkkk-onk). In the second step, if any word is
sequentially repeated, then the repeated word has been removed. Such as (sokal
sokal egula valo lage na - sokal egula valo lage na).
3.3 Tokenization
After completing the data preprocessing steps to convert the raw dataset into a clean
dataset, each data is converted into a list of tokens.
3.4 Normalization
In Bangla-English code mixed sentences there are various forms of words that are spelled
differently but pronounced similarly which can be converted to one single form. As like
(valo-vlo-bhalo, kharap-khrp etc). Another main things among words are, some people
like to write short form of that word and some write the enlarge word. For the proposed
model, normalization list have created manually with the words present in the dataset.
The first word was used for replacing the variations of the spellings of the same Banglish
word.
9. Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/
Journal of Tianjin University Science and Technology
ISSN (Online):0493-2137
E-Publication: Online Open Access
Vol: 56 Issue: 11:2023
DOI: 10.5281/zenodo.10152990
Nov 2023 | 34
Fig. 3: Sample of Data Preprocessing
Fig. 4: Sample of Created Normalization.
3.5 Feature Extraction
Each word are then transformed into numerical value using two different feature
extraction method to evaluate which performs better for the model. Using TF-IDF and
countVectorizer each unique token gets a feature index. Finally, each comment becomes
a vector, and the weighted numbers of each vector represent the score of features.
10. Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/
Journal of Tianjin University Science and Technology
ISSN (Online):0493-2137
E-Publication: Online Open Access
Vol: 56 Issue: 11:2023
DOI: 10.5281/zenodo.10152990
Nov 2023 | 35
3.6 Data Balancing
The dataset contains 887 Neutral, 559 Negative, and 609 Positive comments. This
implies that the classes of the dataset are imbalanced which can effects the performance
of the machine-learning model. To address the imbalanced data, there are several
different approaches that can be taken, including both undersampling and oversampling
techniques. Synthetic Minority Over-sampling Technique (SMOTE) has been used for this
model to make a balance class dataset.
Fig. 5: Class distribution of the dataset.
3.7 Dataset Distribution and Classification
From the dataset 80% kept for train the model and 20% kept for evaluate the performance
of model. For classifying comments on the basis of their sentiment into three classes:
Positive, Negative and Neutral, nine machine learning algorithm has used; Bernoulli naive
Bayes (BNB), Multinomial naive Bayes (MNB), Gaussian naive Bayes (GNB), Decision
Tree Classifier (DT), K-Nearest Neighbour (KNN), Logistic regression (LR), Linear
Support Vector Classifier (SVC), Support Vector Classifier (SVC) and random forests
(RF).
4. RESULT ANALYSIS
The accuracy of several algorithm used to train the model dataset is shown in Figure 6.
The barchart highlights that the algorithms obtained comparatively better accuracy using
Term Frequency and Inverse Document Frequency (TF-IDF) rather than CountVectorizer
feature extraction method. The highest accuracy, 85.7% obtained by Support Vector
Classifier (SVC) using Term Frequency and Inverse Document Frequency (TF-IDF). The
lowest accuracy obtained by K-Nearest Neighbour algorithm for both TF-IDF and
CountVectorizer.
11. Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/
Journal of Tianjin University Science and Technology
ISSN (Online):0493-2137
E-Publication: Online Open Access
Vol: 56 Issue: 11:2023
DOI: 10.5281/zenodo.10152990
Nov 2023 | 36
Fig. 6: Predictive accuracy barchart for TF-IDF and CountVectorizer.
For more depth analysis, three distinct metrics—Precision, Recall and F1-Score have
been identified for implemented classification algorithms to evaluate and better
comprehend the model's performance and predictive outcomes.
Fig. 7: Predictive Precision barchart for TF-IDF and CountVectorizer.
12. Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/
Journal of Tianjin University Science and Technology
ISSN (Online):0493-2137
E-Publication: Online Open Access
Vol: 56 Issue: 11:2023
DOI: 10.5281/zenodo.10152990
Nov 2023 | 37
Fig. 8: Predictive recall barchart for TF-IDF and CountVectorizer.
Table 3: Precision, Recall and F1-Score of algorithms using Countvectorizer.
Classification Precision Recall F1-Score
BernoulliNB 0.66 0.64 0.63
GaussianNB 0.55 0.52 0.49
MultinomialNB 0.71 0.70 0.69
DecisionTreeClassifier 0.68 0.68 0.66
K-Nearest Neighbors 0.60 0.57 0.52
LogisticRegression 0.74 0.74 0.73
Linear Support Vector Classifier 0.72 0.72 0.71
Support Vector Classifier 0.70 0.69 0.69
RandomForestClassifier 0.70 0.70 0.69
Table 3 and Table 4 shows the results for the nine considered classifiers using two
different feature extraction method.
Table 4: Precision, Recall and F1-Score of algorithms using TF-IDF.
Classification Precision Recall F1-Score
BernoulliNB 0.74 0.74 0.73
GaussianNB 0.70 0.67 0.64
MultinomialNB 0.79 0.79 0.78
DecisionTreeClassifier 0.72 0.72 0.71
K-Nearest Neighbors 0.76 0.58 0.52
LogisticRegression 0.81 0.81 0.81
Linear Support Vector Classifier 0.81 0.82 0.81
Support Vector Classifier 0.87 0.86 0.85
RandomForestClassifier 0.78 0.79 0.79
13. Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/
Journal of Tianjin University Science and Technology
ISSN (Online):0493-2137
E-Publication: Online Open Access
Vol: 56 Issue: 11:2023
DOI: 10.5281/zenodo.10152990
Nov 2023 | 38
5. CONCLUSION & FUTURE WORK
This study presents a new technique for sentiment analysis of Bangla-English code-mixed
language, including emojis, in social media communication. The study emphasizes the
challenges of sentiment analysis in code-mixed language. It also highlights the
importance of emojis for nuanced emotional interpretation, while acknowledging limited
previous research. This study contributes to sentiment analysis by constructing a dataset,
handling imbalanced data, creating a Bangla-English lemmatizer, and evaluating
classification algorithms extensively. The results show that using the Term Frequency
and Inverse Document Frequency (TF-IDF) feature extraction method with the Support
Vector Classifier (SVC) achieved an accuracy of 85.7% and an F1 score of 85.0%.
This study proposes several potential directions for future exploration in sentiment
analysis. These include expanding datasets, incorporating additional linguistic features,
improving preprocessing techniques, and optimizing language variations to enhance
effectiveness. Additionally, it aims to develop emotion recognition models that incorporate
emojis, stickers, GIFs, and other elements to examine emotions in code-mixed text.
These future research paths have the potential to not only improve the accuracy of
sentiment analysis but also deepen our understanding of sentiment dynamics in code-
mixing communities.
Acknowledgement
This project is financially supported by a research grant from Hamdard University Bangladesh.
References
1) Gaurav Singh, “Sentiment Analysis of Code-Mixed Social Media Text (Hinglish)”, School of
Computing, University of Leeds, Leeds, LS29JT, UK, doi.org/10.48550/arXiv.2102.12149.
2) Al-Azani S, El-Alfy ESM. Emoji-Based Sentiment Analysis of Arabic Microblogs Using Machine
Learning. In: 2018 21st Saudi Computer Society National Computer Conference (NCC) [Internet].
Riyadh: IEEE; 2018 [cited 2023 Aug 18]. p. 1–6. Available from:
https://ieeexplore.ieee.org/document/8592970/
3) GS. Solakidis et al. 2014. Multilingual sentiment analysis using emoticons and keywords. Proc. - 2014
IEEE/WIC/ACM Int. Jt. Conf. Web Intell. Intell. Agent Technol. - Work. WI-IAT 2014 2, (2014), 102–
109.
4) AJ Shamal. 2019. Sentiment Analysis using Token2Vec and LSTMs: User Review Analyzing Module.
(2019), 48–53.
5) Nusrath Tabassum and Muhammad Ibrahim Khan, ”Design an Empirical Framework for Sentiment
Analysis from Bangla Text using Machine Learning”, 2019 International Conference on Electrical,
Computer and Communication Engineering (ECCE), 7-9 February, 2019,
doi.org/10.1109/ECACE.2019.8679347
6) L. K. Kaye, S. A. Malone, and H. J. Wall, “Emojis: Insights, affordances, and possibilities for
psychological science,” Trends in cognitive sciences, vol. 21, no. 2, pp. 66–68, 2017
7) “Emoji: The 21st Century Language", Divyansh Kandpal FEB 14, 2021, 18:57 IST,
www.timesofindia.indiatimes.com/readersblog/dkwrites.
14. Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/
Journal of Tianjin University Science and Technology
ISSN (Online):0493-2137
E-Publication: Online Open Access
Vol: 56 Issue: 11:2023
DOI: 10.5281/zenodo.10152990
Nov 2023 | 39
8) “World Emoji Day 2023: Celebrating the universal language of emojis and their digital impact”, Govind
Choudhary, 17 Jul 2023, www.livemint.com/authors/govind-choudhary
9) “Emoji is the new universal language. And it’s making us better communicators”, Vyvyan Evans,
Professor of Linguistics | Researcher, published, Aug 5, 2017, Linkedin,
www.linkedin.com/pulse/emoji-new-universal-language-its-making-us-better-vyv-evans
10) “Emojipedia Shares Most Used Emojis of 2023 for World Emoji Day”, Andrew Hutchinson, Published
July 17, 2023, www.socialmediatoday.com/news/emojipedia-shares-most-used-emojis-2023-world-
emoji-day/688143/
11) P. K. Novak, J. Smailovic, B. Sluban, and I. Mozeti ´ c, “Sentiment of ˇ emojis,” PloS one, vol. 10, no.
12, p. e0144296, 2015.
12) https://en.oxforddictionaries.com/word-of-the-year/word-of-the-year-2015
13) G. Abdulsattar et al. 2018. Stock Market Classification Model Using Sentiment Analysis on Twitter
Based on Hybrid Naive Bayes Classifiers.
14) Mohammed Alzubaidi and Farid Bourennani. 2021. Sentiment Analysis of Tweets Using Emojis and
Texts. In 2021 The 4th International Conference on Information Science and Systems (ICISS 2021),
March 17–19, 2021, Edinburgh, United Kingdom. ACM, New York, NY, USA, 4 pages.
https://doi.org/10.1145/ 3459955.3460608
15) Mahmud AA. Artificial Intelligence in Business Decision Making: A Study on Code-Mixed and
Transliterated Bangla Customer Reviews. SSRN Electron J [Internet]. 2021 [cited 2023 Aug 25];
Available from: https://www.ssrn.com/abstract=3875534
16) Khan MdSS, Rafa SR, Abir AEH, Das AK. Sentiment Analysis on Bengali Facebook Comments To
Predict Fan’s Emotions Towards a Celebrity. J Eng Adv. 2021 Jul 23;118–24.
17) Khan MdRH, Afroz US, Masum AKM, Abujar S, Hossain SA. Sentiment Analysis from Bengali
Depression Dataset using Machine Learning. In: 2020 11th International Conference on Computing,
Communication and Networking Technologies (ICCCNT) [Internet]. Kharagpur, India: IEEE; 2020
[cited 2023 Aug 18]. p. 1–5. Available from: https://ieeexplore.ieee.org/document/9225511/
18) Soumil Mandal, Sainik Kumar Mahata and Dipankar Das, “Preparing Bengali-English Code-Mixed
Corpus for Sentiment Analysis of Indian Languages”, 11 Mar 2018,
https://doi.org/10.48550/arXiv.1803.04000
19) M. Tareq, M. F. Islam, S. Deb, S. Rahman and A. A. Mahmud, "Data-Augmentation for Bangla-English
Code-Mixed Sentiment Analysis: Enhancing Cross Linguistic Contextual Understanding," in IEEE
Access, vol. 11, pp. 51657-51671, 2023, doi: 10.1109/ACCESS.2023.3277787.
20) Jamatia, S. D. Swamy, B. Gambäck, A. Das, S. Debbarma .(2020). Deep Learning Based Sentiment
Analysis in a Code-Mixed English-Hindi and English-Bengali Social Media Corpus. International
Journal on Artificial Intelligence Tools. doi:10.1142/s0218213020500141.
21) Patwa, P., Aguilar, G., Kar, S., Pandey, S., Pykl, S., Garrette, D., Gambck, B., Chakraborty, T., Solorio,
T. and Das, A. 2020. SemEval-2020 Sentimix Task 9: Overview of SENTIment Analysis of Code-
MIXed Tweets. Proceedings of the 14th International Workshop on Semantic Evaluation (SemEval-
2020)
22) A.M. MacEachren et al. 2010. Geo-Twitter Analytics: Applications in Crisis Management. Proc. 25th
Int. Cartogr. Conf. (2010), 1–8.
23) B. G. Patra, D. Das, and A. Das. Sentiment analysis of code-mixed indian languages: An overview of
sail code-mixed shared task @icon-2017. CoRR, abs/1803.06745, 2018.
15. Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/
Journal of Tianjin University Science and Technology
ISSN (Online):0493-2137
E-Publication: Online Open Access
Vol: 56 Issue: 11:2023
DOI: 10.5281/zenodo.10152990
Nov 2023 | 40
24) Joshi, A. Prabhu, M. Shrivastava, and V. Varma. Towards sub-word level compositions for sentiment
analysis of Hindi-English code mixed text. In Proceedings of the 26th International Conference on
Computational Linguistics, pages 2482–2491, Osaka, Japan, Dec. 2016. ACL.
25) P. Nakov, Z. Kozareva, S. Rosenthal, V. Stoyanov, A. Ritter, and T. Wilson. SemEval-2013 Task 2:
Sentiment analysis in Twitter. In Second Joint Conference on Lexical and Computational Semantics
(*SEM), Volume 2: Proceedings of the 7th International Workshop on Semantic Evaluation, pages
312–320, Atlanta, Georgia, June 2013. ACL.
26) J. Barnes, R. Klinger, and S. Schulte im Walde. Assessing state-of-the-art sentiment models on state-
of-the-art sentiment datasets. In Proceedings of the 8th Workshop on Computational Approaches to
Subjectivity, Sentiment and Social Media Analysis, pages 2–12, Copenhagen, Denmark, Sept. 2017.
Association for Computational Linguistics.