In this era of technology, enormous Online Social Networking Sites (OSNs) have arisen as a medium of expressing any opinions, thoughts towards anything even support their status against any social or political matter at the same time. Nowadays, people connected to those networks are more likely to prefer to employ themselves utilizing these online platforms to exhibit their standings upon any political organizations
participating in the election throughout the whole election period. The aim of this paper is to predict the outcome of the election by engaging the tweets posted on Twitter pertaining to the Australian federal election-2019 held on May 18, 2019. We aggregated two efficacious techniques in order to extract the
information from the tweet data to count a virtual vote for each corresponding political group. The original results of the election closely match the findings of our investigation, published by the Australian Electoral Commission.
PREDICTING ELECTION OUTCOME FROM SOCIAL MEDIA DATAkevig
In this era of technology, enormous Online Social Networking Sites (OSNs) have arisen as a medium of
expressing any opinions, thoughts towards anything even support their status against any social or
political matter at the same time. Nowadays, people connected to those networks are more likely to prefer
to employ themselves utilizing these online platforms to exhibit their standings upon any political
organizations participating in the election throughout the whole election period. The aim of this paper is to
predict the outcome of the election by engaging the tweets posted on Twitter pertaining to the Australian
federal election-2019 held on May 18, 2019. We aggregated two efficacious techniques in order to extract
the information from the tweet data to count a virtual vote for each corresponding political group. The
original results of the election closely match the findings of our investigation, published by the Australian
Electoral Commission.
This is a presentation given at the ICWSM 2010 in Washington, DC (www.icwsm.org). You can watch a video of the presentation on videolectures.net
Twitter is a microblogging website where users read and write millions of short messages on a variety of topics every day. This study uses the context of the German federal election to investigate whether Twitter is used as a forum for political deliberation and whether online messages on Twitter validly mirror offline political sentiment. Using LIWC text analysis software, we conducted a content-analysis of over 100,000 messages containing a reference to either a political party or a politician. Our results show that Twitter is indeed used extensively for political deliberation. We find that the mere number of messages mentioning a party reflects the election result. Moreover, joint mentions of two parties are in line with real world political ties and coalitions. An analysis of the tweets’ political sentiment demonstrates close correspondence to the parties' and politicians’ political positions indicating that the content of Twitter messages plausibly reflects the offline political landscape. We discuss the use of microblogging message content as a valid indicator of political sentiment and derive suggestions for further research.
Twitter Based Outcome Predictions of 2019 Indian General Elections Using Deci...Ferdin Joe John Joseph PhD
Presented at the 4th International Conference on Information Technology InCIT 2019 organised by Thai-Nichi Institute of Technology and Council of IT Deans in Thailand (CITT)
Twitter Based Election Prediction and AnalysisIRJET Journal
This document discusses using Twitter data to predict election outcomes through sentiment analysis. It begins with an introduction to election prediction methods and why social media data is being explored as an alternative. The paper then reviews related work on using features like user profiles, linguistic content, and sentiment analysis of tweets mentioning candidates. It describes the methodology used, including data collection from Twitter's API, preprocessing tweets, and performing sentiment analysis using both machine learning and lexicon-based approaches. The results section shows the sentiment analysis identified more positive tweets for Clinton and more negative tweets for Trump, suggesting Clinton would win. Emotion analysis found more tweets expressing sadness for Clinton and joy for Trump.
POLITICAL OPINION ANALYSIS IN SOCIAL NETWORKS: CASE OF TWITTER AND FACEBOOKIJwest
The 21st century has been characterized by an increased attention to social networks. Nowadays, going 24 hours without getting in touch with them in some way has become difficult. Facebook and Twitter, these social platforms are now part of everyday life. Thus, these social networks have become important sources to be aware of frequently discussed topics or public opinions on a current issue. A lot of people write messages about current events, give their opinion on any topic and discuss social issues more and more.
Sentiment analysis of pre elections tweets (general elections)sahiba javid
This document describes a study conducted on analyzing sentiment of tweets related to the 2019 Indian general elections. Over 12 lakh election tweets were collected from January to May 2019 using Twitter API. The tweets were classified into 5 sentiment classes using SVM and RNN classifiers. RNN achieved higher accuracy of 82.97%. Analysis of tweets showed increase in volume as elections approached. A training dataset of 25k manually annotated tweets was also created for future sentiment analysis of election tweets. The study aims to perform multi-lingual, multi-party and state-wise analysis in future.
IRJET - Political Orientation Prediction using Social Media ActivityIRJET Journal
This document discusses research into predicting the political orientation of Twitter users based on their social media activity. The researchers aim to incorporate factors like tweets, retweets, followers, followees, and network connections to better understand how political views are expressed and shaped on Twitter. Prior studies that have analyzed political bias in media outlets and the spread of information across partisan networks on social media are reviewed. The researchers describe collecting and analyzing Twitter data including tweets, retweets, mentions, followers and who a user follows to predict individual users' political leanings.
Using Tweets for Understanding Public Opinion During U.S. Primaries and Predi...Monica Powell
Abstract
Using social media for political analysis, especially during elections, has become popular in the past few years where many researchers and media now use social media to understand the public opinion and current trends. In this paper, we investigate methods for using Twitter to analyze public opinion and to predict U.S. Presidential Primary Election results. We analyzed over 13 million tweets from February 2016 to April 2016 during the primary elections, and we looked at tweets that mentioned either Hillary Clin- ton, Bernie Sanders, Donald Trump or Ted Cruz. First, we use the methods of sentiment analysis, geospatial analysis, net- work analysis, and visualizations tools to examine public opinion on twitter. We then use the twitter data and analysis results to propose a prediction model for predicting primary election results. Our results highlight the feasibility of using social media to look at public opinion and predict election results.
PREDICTING ELECTION OUTCOME FROM SOCIAL MEDIA DATAkevig
In this era of technology, enormous Online Social Networking Sites (OSNs) have arisen as a medium of
expressing any opinions, thoughts towards anything even support their status against any social or
political matter at the same time. Nowadays, people connected to those networks are more likely to prefer
to employ themselves utilizing these online platforms to exhibit their standings upon any political
organizations participating in the election throughout the whole election period. The aim of this paper is to
predict the outcome of the election by engaging the tweets posted on Twitter pertaining to the Australian
federal election-2019 held on May 18, 2019. We aggregated two efficacious techniques in order to extract
the information from the tweet data to count a virtual vote for each corresponding political group. The
original results of the election closely match the findings of our investigation, published by the Australian
Electoral Commission.
This is a presentation given at the ICWSM 2010 in Washington, DC (www.icwsm.org). You can watch a video of the presentation on videolectures.net
Twitter is a microblogging website where users read and write millions of short messages on a variety of topics every day. This study uses the context of the German federal election to investigate whether Twitter is used as a forum for political deliberation and whether online messages on Twitter validly mirror offline political sentiment. Using LIWC text analysis software, we conducted a content-analysis of over 100,000 messages containing a reference to either a political party or a politician. Our results show that Twitter is indeed used extensively for political deliberation. We find that the mere number of messages mentioning a party reflects the election result. Moreover, joint mentions of two parties are in line with real world political ties and coalitions. An analysis of the tweets’ political sentiment demonstrates close correspondence to the parties' and politicians’ political positions indicating that the content of Twitter messages plausibly reflects the offline political landscape. We discuss the use of microblogging message content as a valid indicator of political sentiment and derive suggestions for further research.
Twitter Based Outcome Predictions of 2019 Indian General Elections Using Deci...Ferdin Joe John Joseph PhD
Presented at the 4th International Conference on Information Technology InCIT 2019 organised by Thai-Nichi Institute of Technology and Council of IT Deans in Thailand (CITT)
Twitter Based Election Prediction and AnalysisIRJET Journal
This document discusses using Twitter data to predict election outcomes through sentiment analysis. It begins with an introduction to election prediction methods and why social media data is being explored as an alternative. The paper then reviews related work on using features like user profiles, linguistic content, and sentiment analysis of tweets mentioning candidates. It describes the methodology used, including data collection from Twitter's API, preprocessing tweets, and performing sentiment analysis using both machine learning and lexicon-based approaches. The results section shows the sentiment analysis identified more positive tweets for Clinton and more negative tweets for Trump, suggesting Clinton would win. Emotion analysis found more tweets expressing sadness for Clinton and joy for Trump.
POLITICAL OPINION ANALYSIS IN SOCIAL NETWORKS: CASE OF TWITTER AND FACEBOOKIJwest
The 21st century has been characterized by an increased attention to social networks. Nowadays, going 24 hours without getting in touch with them in some way has become difficult. Facebook and Twitter, these social platforms are now part of everyday life. Thus, these social networks have become important sources to be aware of frequently discussed topics or public opinions on a current issue. A lot of people write messages about current events, give their opinion on any topic and discuss social issues more and more.
Sentiment analysis of pre elections tweets (general elections)sahiba javid
This document describes a study conducted on analyzing sentiment of tweets related to the 2019 Indian general elections. Over 12 lakh election tweets were collected from January to May 2019 using Twitter API. The tweets were classified into 5 sentiment classes using SVM and RNN classifiers. RNN achieved higher accuracy of 82.97%. Analysis of tweets showed increase in volume as elections approached. A training dataset of 25k manually annotated tweets was also created for future sentiment analysis of election tweets. The study aims to perform multi-lingual, multi-party and state-wise analysis in future.
IRJET - Political Orientation Prediction using Social Media ActivityIRJET Journal
This document discusses research into predicting the political orientation of Twitter users based on their social media activity. The researchers aim to incorporate factors like tweets, retweets, followers, followees, and network connections to better understand how political views are expressed and shaped on Twitter. Prior studies that have analyzed political bias in media outlets and the spread of information across partisan networks on social media are reviewed. The researchers describe collecting and analyzing Twitter data including tweets, retweets, mentions, followers and who a user follows to predict individual users' political leanings.
Using Tweets for Understanding Public Opinion During U.S. Primaries and Predi...Monica Powell
Abstract
Using social media for political analysis, especially during elections, has become popular in the past few years where many researchers and media now use social media to understand the public opinion and current trends. In this paper, we investigate methods for using Twitter to analyze public opinion and to predict U.S. Presidential Primary Election results. We analyzed over 13 million tweets from February 2016 to April 2016 during the primary elections, and we looked at tweets that mentioned either Hillary Clin- ton, Bernie Sanders, Donald Trump or Ted Cruz. First, we use the methods of sentiment analysis, geospatial analysis, net- work analysis, and visualizations tools to examine public opinion on twitter. We then use the twitter data and analysis results to propose a prediction model for predicting primary election results. Our results highlight the feasibility of using social media to look at public opinion and predict election results.
POLITICAL OPINION ANALYSIS IN SOCIAL NETWORKS: CASE OF TWITTER AND FACEBOOK dannyijwest
The 21st century has been characterized by an increased attention to social networks. Nowadays, going 24
hours without getting in touch with them in some way has become difficult. Facebook and Twitter, these
social platforms are now part of everyday life. Thus, these social networks have become important sources
to be aware of frequently discussed topics or public opinions on a current issue. A lot of people write
messages about current events, give their opinion on any topic and discuss social issues more and more.
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKINGdannyijwest
Social Networks has become one of the most popular platforms to allow users to communicate, and share their interests without being at the same geographical location. With the great and rapid growth of Social Media sites such as Facebook, LinkedIn, Twitter…etc. causes huge amount of user-generated content. Thus, the improvement in the information quality and integrity becomes a great challenge to all social media sites, which allows users to get the desired content or be linked to the best link relation using improved search / link technique. So introducing semantics to social networks will widen up the representation of the social networks. In this paper, a new model of social networks based on semantic tag ranking is introduced. This model is based on the concept of multi-agent systems. In this proposed model the representation of social links will be extended by the semantic relationships found in the vocabularies which are known as (tags) in most of social networks.The proposed model for the social media engine is based on enhanced Latent Dirichlet Allocation(E-LDA) as a semantic indexing algorithm, combined with Tag Rank as social network ranking algorithm. The improvements on (E-LDA) phase is done by optimizing (LDA) algorithm using the optimal parameters. Then a filter is introduced to enhance the final indexing output. In ranking phase, using Tag Rank based on the indexing phase has improved the output of the ranking. Simulation results of the proposed model have shown improvements in indexing and ranking output.
1. The study analyzed the relationship between social media popularity (Facebook fans and Twitter followers) and polling data of 8 Republican presidential candidates from June to December 2011.
2. The results showed a significant correlation between total Facebook fans and polling numbers, but no correlation with Facebook growth rates. There was also a correlation between total Twitter followers and polling numbers after adjusting for outliers.
3. While social media data cannot replace polling, it provides additional insights and may help predict election trends in near real-time, complementing traditional polling methods. More research is still needed to better understand the dynamics.
The Pessimistic Investor Sentiments Indicator in Social NetworksTELKOMNIKA JOURNAL
This document proposes a method to calculate a pessimistic investor sentiments indicator using social network data. It defines pessimistic investor sentiment as consisting of depression, disappointment, fear, anxiety, panic, dread and despair. The frequency of these sentiments is counted from social media posts. An entropy-based formula is used to calculate the indicator, taking into account expert-assigned weights. Applying this method to Chinese stock market data from March 2016 generated time-series values of the indicator that discriminated sentiment changes more clearly when incorporating the weights. The proposed indicator provides a quantitative measure of pessimistic investor sentiment from social networks.
IRJET - Election Result Prediction using Sentiment AnalysisIRJET Journal
This document proposes a method to predict election results using sentiment analysis of social media data. It involves collecting data from Twitter, Facebook, and Instagram using their APIs. The data will then be preprocessed by removing special characters and URLs. Popular machine learning algorithms like Naive Bayes and SVM will be trained on the preprocessed data to classify tweets as positive, negative, or neutral sentiment toward political parties. The classified tweets will then be analyzed to predict the outcome of elections.
This document describes a system created by researchers to analyze sentiment in tweets about 2012 US presidential candidates in real-time. The system collects tweets through the Twitter API, preprocesses them by tokenizing text, matches tweets to candidates, analyzes sentiment using a model trained on Twitter language, aggregates sentiment results by candidate, and visualizes the analysis through interactive dashboards. It provides up-to-the-minute insight into how public opinion responds to political events as expressed on Twitter.
This document summarizes research on using Google Trends data to forecast the results of the 2012 French presidential election. It begins with a literature review on how the internet has been used in political campaigns, particularly in the US. It also discusses previous research using Google Trends in political analysis and forecasting. The document then applies a "forecasting the present" method developed by Choi and Varian to predict the French election results based on combining daily Google Trends and poll data. Finally, it analyzes the French campaign using Google Trends to identify different phases and how candidates' strategies were impacted by institutional rules. The analysis suggests candidates should use limited resources early to control the campaign agenda before equal airtime rules take effect.
This document outlines a presentation on big data for development (BD4D). It discusses the rise of big data and how BD4D techniques like data analytics can be applied. Potential BD4D applications include healthcare, emergency response, and agriculture. Data sources include mobile phones, crowdsourcing, and social media. The presentation also covers BD4D research in Pakistan using mobile data and challenges like data bias, privacy and causation. Open research areas are suggested to further mitigate challenges and advance predictive and multimodal BD4D analytics.
This document summarizes the results of a survey about social media usage among young Europeans. Key findings include:
- Facebook was the most widely used social network across all countries surveyed. However, some countries had prominent national social networks like StudiVZ in Germany.
- Usage of social networks was nearly universal among respondents except in Romania where only 85% used them.
- Frequency of usage was very high, with most respondents accessing their primary social network multiple times per day or once per day.
- Reasons for not using social networks varied but included lack of interest, perceiving them as useless or a waste of time, and not wanting to develop online relationships.
This document discusses negotiating between digital and analog research methods in political science. It outlines experiences examining Finnish political party platforms and government programs digitally from 1880-2012 and 1917-2015 respectively. Key findings include:
1) Wordfish and structural topic modeling reveal changing emphases over time from "traditional" to "modern" issues in party platforms.
2) Government programs covered nation-building, economic development, and later welfare policy.
3) Digital methods provide new insights but must be carefully integrated with theory and interpretation to be persuasive to traditional political science. Hybrid research combining methods is most promising.
NEGOTIATING THE ANALOG MAINSTREAM WITH DIGITAL METHODS IN HAND VISIONS FROM T...Pertti Ahonen
Dr. Pertti Ahonen, Professor, Faculty of Social
Sciences, University of Helsinki, Finland,
pertti.ahonen@helsinki.fi
ESS Digital Sociology Mini-Conference,
Boston, March 17–20, 2016
Session 43, March 17, Thursday, 3:30 – 5:00 p.m.
The document summarizes a study examining the intermedia agenda-setting effect between Twitter and newspapers on the topic of climate change. The study used content analysis to analyze over 5,000 tweets and 1,000 newspaper articles from eight newspapers in five countries. The results found that newspapers were more influential in setting Twitter's agenda on ongoing discussions, while Twitter had more influence on newspapers' agendas surrounding breaking news. Both media were found to influence each other's agendas. Limitations and opportunities for future research were also discussed.
The impact of sentiment analysis from user on Facebook to enhanced the servic...IJECEIAES
The document summarizes a study that analyzed sentiment from 600 user comments on Facebook to understand how user sentiment impacts Facebook's service quality. The comments were collected from three Facebook posts and analyzed using sentiment analysis tools. The results found 41.50% of comments were negative, 22.83% were neutral, and 35.67% were positive. The study aims to help Facebook understand user sentiment and perceptions to improve their services.
There are numerous ways to analyse the web information, generally web substance are housed in
large information sets and basic inquiries are utilized to parse such information sets. As the requests
expanded with time, mining web information amended to meet challenging task in a web analysis.
Machine learning methodologies are the most up to date one to go into these analysis forms. Different
approaches like decision trees, association rules, Meta heuristic and basic learning methods are embraced
for making web data appraisal and mining data from various web instances. This study will highlight these
approaches in perspective of web investigation. One of the prime goals of this exploration is to investigate
more data mining approaches alongside machine learning systems, and to express emerging collaboration
of web analytics with artificial intelligence.
This document summarizes the experiences of Statistics Netherlands with big data research. It discusses two types of data - primary data collected through surveys and secondary data from administrative sources and big data. It provides examples of big data research conducted using road sensor data, mobile phone data, and social media data. Lessons learned include the need for skills in accessing and analyzing large datasets, dealing with noisy unstructured data, and addressing privacy and costs. Important future research topics mentioned are profiling units in big data, data editing at large scale, and data reduction techniques.
• Created a data warehouse of Women Empowerment and Gender Gap, which gives better suggestions and more details for the prospective gender gap.
• Data collected from 6 different data sources ( both structured and unstructured data) using web scraping.
• Cleaned the dataset using R and automated script to load cleaned datasets and processed (ETL).
• Data was analyzed using Visual Studio and Microsoft SQL tools (SSIS, SSMS, SSAS) to create a data warehouse and deploy cube and finally visualized analysis using Tableau.
ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...IJCSEA Journal
In this digital era, social media is an important tool for information dissemination. Twitter is a popular social media platform. Social media analytics helps make informed decisions based on people's needs and opinions. This information, when properly perceived provides valuable insights into different domains, such as public policymaking, marketing, sales, and healthcare. Topic modeling is an unsupervised algorithm to discover a hidden pattern in text documents. In this study, we explore the Latent Dirichlet Allocation (LDA) topic model algorithm. We collected tweets with hashtags related to corona virus related discussions. This study compares regular LDA and LDA based on collapsed Gibbs sampling (LDAMallet) algorithms. The experiments use different data processing steps including trigrams, without trigrams, hashtags, and without hashtags. This study provides a comprehensive analysis of LDA for short text messages using un-pooled and pooled tweets. The results suggest that a pooling scheme using hashtags helps improve the topic inference results with a better coherence score.
PREDICTING ELECTION OUTCOME FROM SOCIAL MEDIA DATAkevig
In this era of technology, enormous Online Social Networking Sites (OSNs) have arisen as a medium of
expressing any opinions, thoughts towards anything even support their status against any social or
political matter at the same time. Nowadays, people connected to those networks are more likely to prefer
to employ themselves utilizing these online platforms to exhibit their standings upon any political
organizations participating in the election throughout the whole election period. The aim of this paper is to
predict the outcome of the election by engaging the tweets posted on Twitter pertaining to the Australian
federal election-2019 held on May 18, 2019. We aggregated two efficacious techniques in order to extract
the information from the tweet data to count a virtual vote for each corresponding political group. The
original results of the election closely match the findings of our investigation, published by the Australian
Electoral Commission.
The document discusses how the Bhartiya Janta Party (BJP) used social networking sites like Facebook, Twitter, LinkedIn, and Google+ to campaign during the 2014 Indian Lok Sabha elections. It focuses on analyzing BJP's social media strategies, including creating a larger than life image of their leaders like Narendra Modi and concentrating people's attention on a single person. The document also provides statistics on political parties' and candidates' social media presence and expenditures on digital platforms. Key BJP leaders like Narendra Modi had millions of followers on social networking sites and the party leveraged platforms like Google Hangouts to engage voters online.
This paper presents the results of a new monitoring project of the US presidential elections with the aim of establishing computer-based tools to track in real time the popularity or awareness of candidates. The designed and developed innovative methods allow us to extract the frequency of queries sent to numerous search engines by US Internet users. Based on these data, this paper demonstrates that Trump was more frequently searched than the Democratic candidates, either Hillary Clinton in 2016 or Joe Biden in 2020. When analyzing the topics, it is observed that in 2020 the US users had shown a remarkable interest in two subjects, namely, Coronavirus and Jobs (unemployment). Interest for other topics such as Education or Healthcare were less pronounced while issues such as Immigration were given even less attention by users. Finally, some “flame” topics such as Black Lives Matter (2020) and Gun Control (2016) appear to be very popular for a few weeks before returning to a low level of interest. When analyzing tweets sent by candidates during the 2020 campaign, one can observe that Trump was focused mainly on Jobs and on Riots, announcing what would happen if Democrats took power. To these negative ads, Biden answered by putting forward moral values (e.g., love, honesty) and political symbols (e.g., democracy, rights) and by underlying the failure of the current administration in resolving the pandemic situation.
Analyzing Political Sentiment of Indic Languages with TransformersIJCI JOURNAL
This paper presents an analysis of sentiment on Twitter towards the Karnataka elections held in 2023, utilizing transformer-based models specifically designed for sentiment analysis in Indic languages. Through an innovative data collection approach involving a combination of novel methods of data augmentation, online data preceding the election was analyzed. The study focuses on sentiment classification, effectively distinguishing between positive, negative, and neutral posts, while specifically targeting the sentiment regarding the loss of the Bharatiya Janata Party (BJP) or the win of the Indian National Congress (INC). Leveraging high-performing transformer architectures, specifically IndicBERT, coupled with specifically fine-tuned hyperparameters, the AI models employed in this study achieved remarkable accuracy in predicting the the INC’s victory in the election. The findings shed new light on the potential of cutting-edge transformer-based models in capturing and analyzing sentiment dynamics within the Indian political landscape. The implications of this research are far-reaching, providing invaluable insights to political parties for informed decision-making and strategic planning in preparation for the forthcoming 2024 Lok Sabha elections in the nation.
The document summarizes a study examining different generations' views of candidates' use of Twitter during presidential campaigns. Focus groups separated by age range discussed their Twitter usage and opinions on candidates' tweets. Younger participants focused more on candidates' reputations, while older groups discussed policy issues. All agreed candidates need an active social media presence to win elections. Twitter was not a major source of political news for any group, but they saw it as important for reaching young voters.
POLITICAL OPINION ANALYSIS IN SOCIAL NETWORKS: CASE OF TWITTER AND FACEBOOK dannyijwest
The 21st century has been characterized by an increased attention to social networks. Nowadays, going 24
hours without getting in touch with them in some way has become difficult. Facebook and Twitter, these
social platforms are now part of everyday life. Thus, these social networks have become important sources
to be aware of frequently discussed topics or public opinions on a current issue. A lot of people write
messages about current events, give their opinion on any topic and discuss social issues more and more.
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKINGdannyijwest
Social Networks has become one of the most popular platforms to allow users to communicate, and share their interests without being at the same geographical location. With the great and rapid growth of Social Media sites such as Facebook, LinkedIn, Twitter…etc. causes huge amount of user-generated content. Thus, the improvement in the information quality and integrity becomes a great challenge to all social media sites, which allows users to get the desired content or be linked to the best link relation using improved search / link technique. So introducing semantics to social networks will widen up the representation of the social networks. In this paper, a new model of social networks based on semantic tag ranking is introduced. This model is based on the concept of multi-agent systems. In this proposed model the representation of social links will be extended by the semantic relationships found in the vocabularies which are known as (tags) in most of social networks.The proposed model for the social media engine is based on enhanced Latent Dirichlet Allocation(E-LDA) as a semantic indexing algorithm, combined with Tag Rank as social network ranking algorithm. The improvements on (E-LDA) phase is done by optimizing (LDA) algorithm using the optimal parameters. Then a filter is introduced to enhance the final indexing output. In ranking phase, using Tag Rank based on the indexing phase has improved the output of the ranking. Simulation results of the proposed model have shown improvements in indexing and ranking output.
1. The study analyzed the relationship between social media popularity (Facebook fans and Twitter followers) and polling data of 8 Republican presidential candidates from June to December 2011.
2. The results showed a significant correlation between total Facebook fans and polling numbers, but no correlation with Facebook growth rates. There was also a correlation between total Twitter followers and polling numbers after adjusting for outliers.
3. While social media data cannot replace polling, it provides additional insights and may help predict election trends in near real-time, complementing traditional polling methods. More research is still needed to better understand the dynamics.
The Pessimistic Investor Sentiments Indicator in Social NetworksTELKOMNIKA JOURNAL
This document proposes a method to calculate a pessimistic investor sentiments indicator using social network data. It defines pessimistic investor sentiment as consisting of depression, disappointment, fear, anxiety, panic, dread and despair. The frequency of these sentiments is counted from social media posts. An entropy-based formula is used to calculate the indicator, taking into account expert-assigned weights. Applying this method to Chinese stock market data from March 2016 generated time-series values of the indicator that discriminated sentiment changes more clearly when incorporating the weights. The proposed indicator provides a quantitative measure of pessimistic investor sentiment from social networks.
IRJET - Election Result Prediction using Sentiment AnalysisIRJET Journal
This document proposes a method to predict election results using sentiment analysis of social media data. It involves collecting data from Twitter, Facebook, and Instagram using their APIs. The data will then be preprocessed by removing special characters and URLs. Popular machine learning algorithms like Naive Bayes and SVM will be trained on the preprocessed data to classify tweets as positive, negative, or neutral sentiment toward political parties. The classified tweets will then be analyzed to predict the outcome of elections.
This document describes a system created by researchers to analyze sentiment in tweets about 2012 US presidential candidates in real-time. The system collects tweets through the Twitter API, preprocesses them by tokenizing text, matches tweets to candidates, analyzes sentiment using a model trained on Twitter language, aggregates sentiment results by candidate, and visualizes the analysis through interactive dashboards. It provides up-to-the-minute insight into how public opinion responds to political events as expressed on Twitter.
This document summarizes research on using Google Trends data to forecast the results of the 2012 French presidential election. It begins with a literature review on how the internet has been used in political campaigns, particularly in the US. It also discusses previous research using Google Trends in political analysis and forecasting. The document then applies a "forecasting the present" method developed by Choi and Varian to predict the French election results based on combining daily Google Trends and poll data. Finally, it analyzes the French campaign using Google Trends to identify different phases and how candidates' strategies were impacted by institutional rules. The analysis suggests candidates should use limited resources early to control the campaign agenda before equal airtime rules take effect.
This document outlines a presentation on big data for development (BD4D). It discusses the rise of big data and how BD4D techniques like data analytics can be applied. Potential BD4D applications include healthcare, emergency response, and agriculture. Data sources include mobile phones, crowdsourcing, and social media. The presentation also covers BD4D research in Pakistan using mobile data and challenges like data bias, privacy and causation. Open research areas are suggested to further mitigate challenges and advance predictive and multimodal BD4D analytics.
This document summarizes the results of a survey about social media usage among young Europeans. Key findings include:
- Facebook was the most widely used social network across all countries surveyed. However, some countries had prominent national social networks like StudiVZ in Germany.
- Usage of social networks was nearly universal among respondents except in Romania where only 85% used them.
- Frequency of usage was very high, with most respondents accessing their primary social network multiple times per day or once per day.
- Reasons for not using social networks varied but included lack of interest, perceiving them as useless or a waste of time, and not wanting to develop online relationships.
This document discusses negotiating between digital and analog research methods in political science. It outlines experiences examining Finnish political party platforms and government programs digitally from 1880-2012 and 1917-2015 respectively. Key findings include:
1) Wordfish and structural topic modeling reveal changing emphases over time from "traditional" to "modern" issues in party platforms.
2) Government programs covered nation-building, economic development, and later welfare policy.
3) Digital methods provide new insights but must be carefully integrated with theory and interpretation to be persuasive to traditional political science. Hybrid research combining methods is most promising.
NEGOTIATING THE ANALOG MAINSTREAM WITH DIGITAL METHODS IN HAND VISIONS FROM T...Pertti Ahonen
Dr. Pertti Ahonen, Professor, Faculty of Social
Sciences, University of Helsinki, Finland,
pertti.ahonen@helsinki.fi
ESS Digital Sociology Mini-Conference,
Boston, March 17–20, 2016
Session 43, March 17, Thursday, 3:30 – 5:00 p.m.
The document summarizes a study examining the intermedia agenda-setting effect between Twitter and newspapers on the topic of climate change. The study used content analysis to analyze over 5,000 tweets and 1,000 newspaper articles from eight newspapers in five countries. The results found that newspapers were more influential in setting Twitter's agenda on ongoing discussions, while Twitter had more influence on newspapers' agendas surrounding breaking news. Both media were found to influence each other's agendas. Limitations and opportunities for future research were also discussed.
The impact of sentiment analysis from user on Facebook to enhanced the servic...IJECEIAES
The document summarizes a study that analyzed sentiment from 600 user comments on Facebook to understand how user sentiment impacts Facebook's service quality. The comments were collected from three Facebook posts and analyzed using sentiment analysis tools. The results found 41.50% of comments were negative, 22.83% were neutral, and 35.67% were positive. The study aims to help Facebook understand user sentiment and perceptions to improve their services.
There are numerous ways to analyse the web information, generally web substance are housed in
large information sets and basic inquiries are utilized to parse such information sets. As the requests
expanded with time, mining web information amended to meet challenging task in a web analysis.
Machine learning methodologies are the most up to date one to go into these analysis forms. Different
approaches like decision trees, association rules, Meta heuristic and basic learning methods are embraced
for making web data appraisal and mining data from various web instances. This study will highlight these
approaches in perspective of web investigation. One of the prime goals of this exploration is to investigate
more data mining approaches alongside machine learning systems, and to express emerging collaboration
of web analytics with artificial intelligence.
This document summarizes the experiences of Statistics Netherlands with big data research. It discusses two types of data - primary data collected through surveys and secondary data from administrative sources and big data. It provides examples of big data research conducted using road sensor data, mobile phone data, and social media data. Lessons learned include the need for skills in accessing and analyzing large datasets, dealing with noisy unstructured data, and addressing privacy and costs. Important future research topics mentioned are profiling units in big data, data editing at large scale, and data reduction techniques.
• Created a data warehouse of Women Empowerment and Gender Gap, which gives better suggestions and more details for the prospective gender gap.
• Data collected from 6 different data sources ( both structured and unstructured data) using web scraping.
• Cleaned the dataset using R and automated script to load cleaned datasets and processed (ETL).
• Data was analyzed using Visual Studio and Microsoft SQL tools (SSIS, SSMS, SSAS) to create a data warehouse and deploy cube and finally visualized analysis using Tableau.
ANALYSIS OF TOPIC MODELING WITH UNPOOLED AND POOLED TWEETS AND EXPLORATION OF...IJCSEA Journal
In this digital era, social media is an important tool for information dissemination. Twitter is a popular social media platform. Social media analytics helps make informed decisions based on people's needs and opinions. This information, when properly perceived provides valuable insights into different domains, such as public policymaking, marketing, sales, and healthcare. Topic modeling is an unsupervised algorithm to discover a hidden pattern in text documents. In this study, we explore the Latent Dirichlet Allocation (LDA) topic model algorithm. We collected tweets with hashtags related to corona virus related discussions. This study compares regular LDA and LDA based on collapsed Gibbs sampling (LDAMallet) algorithms. The experiments use different data processing steps including trigrams, without trigrams, hashtags, and without hashtags. This study provides a comprehensive analysis of LDA for short text messages using un-pooled and pooled tweets. The results suggest that a pooling scheme using hashtags helps improve the topic inference results with a better coherence score.
PREDICTING ELECTION OUTCOME FROM SOCIAL MEDIA DATAkevig
In this era of technology, enormous Online Social Networking Sites (OSNs) have arisen as a medium of
expressing any opinions, thoughts towards anything even support their status against any social or
political matter at the same time. Nowadays, people connected to those networks are more likely to prefer
to employ themselves utilizing these online platforms to exhibit their standings upon any political
organizations participating in the election throughout the whole election period. The aim of this paper is to
predict the outcome of the election by engaging the tweets posted on Twitter pertaining to the Australian
federal election-2019 held on May 18, 2019. We aggregated two efficacious techniques in order to extract
the information from the tweet data to count a virtual vote for each corresponding political group. The
original results of the election closely match the findings of our investigation, published by the Australian
Electoral Commission.
The document discusses how the Bhartiya Janta Party (BJP) used social networking sites like Facebook, Twitter, LinkedIn, and Google+ to campaign during the 2014 Indian Lok Sabha elections. It focuses on analyzing BJP's social media strategies, including creating a larger than life image of their leaders like Narendra Modi and concentrating people's attention on a single person. The document also provides statistics on political parties' and candidates' social media presence and expenditures on digital platforms. Key BJP leaders like Narendra Modi had millions of followers on social networking sites and the party leveraged platforms like Google Hangouts to engage voters online.
This paper presents the results of a new monitoring project of the US presidential elections with the aim of establishing computer-based tools to track in real time the popularity or awareness of candidates. The designed and developed innovative methods allow us to extract the frequency of queries sent to numerous search engines by US Internet users. Based on these data, this paper demonstrates that Trump was more frequently searched than the Democratic candidates, either Hillary Clinton in 2016 or Joe Biden in 2020. When analyzing the topics, it is observed that in 2020 the US users had shown a remarkable interest in two subjects, namely, Coronavirus and Jobs (unemployment). Interest for other topics such as Education or Healthcare were less pronounced while issues such as Immigration were given even less attention by users. Finally, some “flame” topics such as Black Lives Matter (2020) and Gun Control (2016) appear to be very popular for a few weeks before returning to a low level of interest. When analyzing tweets sent by candidates during the 2020 campaign, one can observe that Trump was focused mainly on Jobs and on Riots, announcing what would happen if Democrats took power. To these negative ads, Biden answered by putting forward moral values (e.g., love, honesty) and political symbols (e.g., democracy, rights) and by underlying the failure of the current administration in resolving the pandemic situation.
Analyzing Political Sentiment of Indic Languages with TransformersIJCI JOURNAL
This paper presents an analysis of sentiment on Twitter towards the Karnataka elections held in 2023, utilizing transformer-based models specifically designed for sentiment analysis in Indic languages. Through an innovative data collection approach involving a combination of novel methods of data augmentation, online data preceding the election was analyzed. The study focuses on sentiment classification, effectively distinguishing between positive, negative, and neutral posts, while specifically targeting the sentiment regarding the loss of the Bharatiya Janata Party (BJP) or the win of the Indian National Congress (INC). Leveraging high-performing transformer architectures, specifically IndicBERT, coupled with specifically fine-tuned hyperparameters, the AI models employed in this study achieved remarkable accuracy in predicting the the INC’s victory in the election. The findings shed new light on the potential of cutting-edge transformer-based models in capturing and analyzing sentiment dynamics within the Indian political landscape. The implications of this research are far-reaching, providing invaluable insights to political parties for informed decision-making and strategic planning in preparation for the forthcoming 2024 Lok Sabha elections in the nation.
The document summarizes a study examining different generations' views of candidates' use of Twitter during presidential campaigns. Focus groups separated by age range discussed their Twitter usage and opinions on candidates' tweets. Younger participants focused more on candidates' reputations, while older groups discussed policy issues. All agreed candidates need an active social media presence to win elections. Twitter was not a major source of political news for any group, but they saw it as important for reaching young voters.
IRJET- Sentiment Analysis using Machine LearningIRJET Journal
This document discusses sentiment analysis of social media posts using machine learning. The authors aim to classify social media posts as having either a political or non-political sentiment. A dictionary of keywords and their sentiments is created to analyze posts. Users can make posts that are then classified, and an admin can hide or delete posts containing harmful keywords. Graphs are generated to analyze the classification of posts as political versus non-political and by sentiment. The accuracy of classification depends on the training data and dictionary.
The document discusses the increasing use of digital technologies and social media in political campaigns in India. It provides background on the evolution of political campaigning in India from traditional methods to increasingly sophisticated digital strategies. Key points include:
- Political parties in India have started employing advanced digital tools and techniques in their campaigns, known as "computational politics".
- The 2014 national election saw widespread use of social media by major parties like BJP, AAP, and Congress to engage voters online. BJP had a particularly robust digital campaign.
- The document examines how digitalization and social media have transformed political communication and campaigns in India, both positively by stimulating debate, and potentially negatively by influencing voters.
- A survey
This document proposes a method to quantify the political leaning of Twitter users based on their tweet and retweet activity. It formulates the inference of political leaning as a convex optimization problem that incorporates two ideas: (1) a user's tweets and retweets should be consistent in sentiment, and (2) similar users tend to be retweeted by similar audiences. The method is evaluated on 119 million election-related tweets from the 2012 US presidential election and achieves 94% accuracy in classifying frequently retweeted sources. A quantitative analysis of the tweets also finds that parody accounts and less vocal users are more likely to be liberal, while hashtags usage changes significantly with political events.
LOCATION-BASED SENTIMENT ANALYSIS OF 2019 NIGERIA PRESIDENTIAL ELECTION USING...kevig
Nigeria president Buhari defeated his closest rival Atiku Abubakar by over 3 million votes. He was issued a Certificate of Return and was sworn in on 29 May 2019. However, there were claims of widespread hoax by the opposition. The sentiment analysis captures the opinions of the masses over social media for global events. In this paper, we use 2019 Nigeria presidential election tweets to perform sentiment analysis through the application of a voting ensemble approach (VEA) in which the predictions from multiple techniques are combined to find the best polarity of a tweet (sentence). This is to determine public views on the 2019 Nigeria Presidential elections and compare them with actual election results. Our sentiment analysis experiment is focused on location-based viewpoints where we used Twitter location data. For this experiment, we live-streamed Nigeria 2019 election tweets via Twitter API to create tweets dataset of 583816 size, pre-processed the data, and applied VEA by utilizing three different Sentiment Classifiers to obtain the choicest polarity of a given tweet. Furthermore, we segmented our tweets dataset into Nigerian states and geopolitical zones, then plotted state-wise and geopolitical-wise user sentiments towards Buhari and Atiku and their political parties. The overall objective of the use of states/geopolitical zones is to evaluate the similarity between the sentiment of location-based tweets compared to actual election results. The results reveal that whereas there are election outcomes that coincide with the sentiment expressed on Twitter social media in most cases as shown by the polarity scores of different locations, there are also some election results where our location analysis similarity test failed.
LOCATION-BASED SENTIMENT ANALYSIS OF 2019 NIGERIA PRESIDENTIAL ELECTION USING...kevig
Nigeria president Buhari defeated his closest rival Atiku Abubakar by over 3 million votes. He was issued a
Certificate of Return and was sworn in on 29 May 2019. However, there were claims of widespread hoax
by the opposition. The sentiment analysis captures the opinions of the masses over social media for global
events. In this paper, we use 2019 Nigeria presidential election tweets to perform sentiment analysis
through the application of a voting ensemble approach (VEA) in which the predictions from multiple
techniques are combined to find the best polarity of a tweet (sentence). This is to determine public views on
the 2019 Nigeria Presidential elections and compare them with actual election results. Our sentiment
analysis experiment is focused on location-based viewpoints where we used Twitter location data. For this
experiment, we live-streamed Nigeria 2019 election tweets via Twitter API to create tweets dataset of
583816 size, pre-processed the data, and applied VEA by utilizing three different Sentiment Classifiers to
obtain the choicest polarity of a given tweet. Furthermore, we segmented our tweets dataset into Nigerian
states and geopolitical zones, then plotted state-wise and geopolitical-wise user sentiments towards Buhari
and Atiku and their political parties. The overall objective of the use of states/geopolitical zones is to
evaluate the similarity between the sentiment of location-based tweets compared to actual election results.
The results reveal that whereas there are election outcomes that coincide with the sentiment expressed on
Twitter social media in most cases as shown by the polarity scores of different locations, there are also
some election results where our location analysis similarity test failed.
In the present times, social media is one such platform which has been useful in connecting the people throughout the world. Be it a personal interaction, a product promotion, an advertisement or a political campaign, social media has formed to be the best platform to connect to people globally. In this report the discussion will be focused on the how and why the social media has been used as the medium for political campaigns in offices. The importance of social media for political campaigns will be analyzed and discussed. Thus the research will be focused on the role of social media in political engagement. There will be analysis of how the new age media has increased the possibilities to the ideal situation for political campaign
This paper investigates the factors that influence how members of the Brazilian House of Representatives adopt Twitter as part of their political communication strategy. Our intention is to understand the main variables that lead legislators to invest time and resources in microblogging. This quantitative study examined all of the 457 official congressional accounts registered on Twitter. These accounts were monitored weekly between February 2012 and February 2013. After empirically exploring the data, this work reflects on the results thanks in part to an examination of the literature on the intersection of the Internet and politics. The results indicate correlations between the use of Twitter and attributes like age and occupation of leadership positions.
The effects of social networks as a public relations tool in political commun...Gabriela Olaru
The document summarizes a study that examined how political parties in Turkey use social networks as a public relations tool during political campaigns. It describes the study's methodology, which included examining online traffic, interviewing party representatives, and conducting a web-based survey. The study found that political parties mainly used websites and social media to promote their positions rather than provide information. It also found correlations between social media use and its influence on political decisions and campaigns.
Online media usage for political campaigningThomas Euler
This document provides an abstract for a report investigating the usage of online media for political campaigns in the 2008 US elections. The report analyzes data from two post-election surveys to identify the mindset of political social media users, predict their usage of different online channels, and understand the role of social media in influencing citizens' decisions to volunteer for a campaign. The report aims to develop an understanding of how the internet and social media influence political campaigning and the implications for public relations involved in political campaigns. It recommends topics for future research.
Social Media and West Bengal Political Parties A Brief Analysisijtsrd
The rise of the social networking sites in the early 2000s, has led to the increase in the worlds networked population. Social media ensured that people have greater access to information, content and opportunities to engage in public sphere to undertake united action. Social media has entered into our daily lives and it influences people and organizations all over the world involving many actors regular citizens, social activists, non governmental organizations, telecommunications companies, software service providers, and also governments at large. Social media revolution in Indian politics is real and its impact can be assessed by General elections of 2014 and 2019. The General election of 2014 was regarded as the 1st social media election of India due to the ever increasing use of social media by political actors. Nevertheless, Social media has also impacted politics in all major states including West Bengal. No doubt social media is now being seriously considered by the West Bengal political parties as a mean to reach out to the electorate, but will it influence the 2021 Assembly Elections in the same way as in Obama’s Presidential elections This paper analyses the reach of West Bengal based parties in various social media platforms. Rajarshi Guha "Social Media and West Bengal Political Parties: A Brief Analysis" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5 | Issue-2 , February 2021, URL: https://www.ijtsrd.com/papers/ijtsrd38402.pdf Paper Url: https://www.ijtsrd.com/humanities-and-the-arts/political-science/38402/social-media-and-west-bengal-political-parties-a-brief-analysis/rajarshi-guha
This document provides an introduction and literature review for a research project analyzing the influence of social media on the 2008 and 2012 US presidential campaigns. It begins with an introduction that outlines the research questions and comparative qualitative methodology. The literature review then discusses theories on the role of media in elections, the impact of new technologies like television, and the emergence of social media platforms. It argues that social media expanded the public sphere online and became an important campaign tool, especially for Barack Obama in 2008. The research will examine the role of Facebook, YouTube, and Twitter in storytelling, organizing, and maintaining political engagement in both election cycles.
The document summarizes a study on the impact of social media platforms Twitter and Facebook on the 2015 UK General Election. It utilized a mixed-methods approach including an online survey of 52 participants and interviews. The survey found that most respondents were female, between 16-18 years old, and lived in urban areas of England. Qualitative interviews explored how and why social media may have influenced peoples' votes. The study aimed to understand if social media was a major factor in political campaigns and if any voting patterns emerged in relation to these platforms.
The document analyzes the impact of social media on Malaysia's 2013 general election. It finds that while the ruling Barisan Nasional coalition closed the social media gap with the opposition by 2013, it still saw declining electoral performance compared to 2008. Specifically, Barisan Nasional won fewer seats and a lower popular vote share in 2013 versus 2008, even as its social media presence grew and its leader Najib Razak dominated online. The opposition Pakatan Rakyat coalition improved its seat count and popular vote from 2008 despite having less consolidated social media platforms and leadership branding. Thus, social media's influence on actual voter behavior and election outcomes remains unclear in the Malaysian context.
Politics 2.0: How political parties in Sweden are using social mediaOmer Rosenbaum
This document summarizes a case study on how political parties in Sweden used social media for political communication in 2009. It reviewed the main parties' use of internal platforms like websites and external platforms like social networks. The results showed the more popular Social Democrat and Moderate parties used more social media features than others. However, there were no clear signs social media was an established communication strategy for any party. Future research could track changes over time or analyze one party's social media use in more depth.
Similar to PREDICTING ELECTION OUTCOME FROM SOCIAL MEDIA DATA (20)
Identification and Classification of Named Entities in Indian Languageskevig
The process of identification of Named Entities (NEs) in a given document and then there classification into
different categories of NEs is referred to as Named Entity Recognition (NER). We need to do a great effort
in order to perform NER in Indian languages and achieve the same or higher accuracy as that obtained by
English and the European languages. In this paper, we have presented the results that we have achieved by
performing NER in Hindi, Bengali and Telugu using Hidden Markov Model (HMM) and Performance
Metrics.
Effect of Query Formation on Web Search Engine Resultskevig
Query in a search engine is generally based on natural language. A query can be expressed in more than
one way without changing its meaning as it depends on thinking of human being at a particular moment.
Aim of the searcher is to get most relevant results immaterial of how the query has been expressed. In the
present paper, we have examined the results of search engine for change in coverage and similarity of first
few results when a query is entered in two semantically same but in different formats. Searching has been
made through Google search engine. Fifteen pairs of queries have been chosen for the study. The t-test has
been used for the purpose and the results have been checked on the basis of total documents found,
similarity of first five and first ten documents found in the results of a query entered in two different
formats. It has been found that the total coverage is same but first few results are significantly different.
Investigations of the Distributions of Phonemic Durations in Hindi and Dogrikevig
Speech generation is one of the most important areas of research in speech signal processing which is now gaining a serious attention. Speech is a natural form of communication in all living things. Computers with the ability to understand speech and speak with a human like voice are expected to contribute to the development of more natural man-machine interface. However, in order to give those functions that are even closer to those of human beings, we must learn more about the mechanisms by which speech is produced and perceived, and develop speech information processing technologies that can generate a more natural sounding systems. The so described field of stud, also called speech synthesis and more prominently acknowledged as text-to-speech synthesis, originated in the mid eighties because of the emergence of DSP and the rapid advancement of VLSI techniques. To understand this field of speech, it is necessary to understand the basic theory of speech production. Every language has different phonetic alphabets and a different set of possible phonemes and their combinations.
For the analysis of the speech signal, we have carried out the recording of five speakers in Dogri (3 male and 5 females) and eight speakers in Hindi language (4 male and 4 female). For estimating the durational distributions, the mean of mean of ten instances of vowels of each speaker in both the languages has been calculated. Investigations have shown that the two durational distributions differ significantly with respect to mean and standard deviation. The duration of phoneme is speaker dependent. The whole investigation can be concluded with the end result that almost all the Dogri phonemes have shorter duration, in comparison to Hindi phonemes. The period in milli seconds of same phonemes when uttered in Hindi were found to be longer compared to when they were spoken by a person with Dogri as his mother tongue. There are many applications which are directly of indirectly related to the research being carried out. For instance the main application may be for transforming Dogri speech into Hindi and vice versa, and further utilizing this application, we can develop a speech aid to teach Dogri to children. The results may also be useful for synthesizing the phonemes of Dogri using the parameters of the phonemes of Hindi and for building large vocabulary speech recognition systems.
May 2024 - Top10 Cited Articles in Natural Language Computingkevig
Natural Language Processing is a programmed approach to analyze text that is based on both a set of theories and a set of technologies. This forum aims to bring together researchers who have designed and build software that will analyze, understand, and generate languages that humans use naturally to address computers.
Effect of Singular Value Decomposition Based Processing on Speech Perceptionkevig
Speech is an important biological signal for primary mode of communication among human being and also the most natural and efficient form of exchanging information among human in speech. Speech processing is the most important aspect in signal processing. In this paper the theory of linear algebra called singular value decomposition (SVD) is applied to the speech signal. SVD is a technique for deriving important parameters of a signal. The parameters derived using SVD may further be reduced by perceptual evaluation of the synthesized speech using only perceptually important parameters, where the speech signal can be compressed so that the information can be transformed into compressed form without losing its quality. This technique finds wide applications in speech compression, speech recognition, and speech synthesis. The objective of this paper is to investigate the effect of SVD based feature selection of the input speech on the perception of the processed speech signal. The speech signal which is in the form of vowels \a\, \e\, \u\ were recorded from each of the six speakers (3 males and 3 females). The vowels for the six speakers were analyzed using SVD based processing and the effect of the reduction in singular values was investigated on the perception of the resynthesized vowels using reduced singular values. Investigations have shown that the number of singular values can be drastically reduced without significantly affecting the perception of the vowels.
Identifying Key Terms in Prompts for Relevance Evaluation with GPT Modelskevig
Relevance evaluation of a query and a passage is essential in Information Retrieval (IR). Recently, numerous studies have been conducted on tasks related to relevance judgment using Large Language Models (LLMs) such as GPT-4,
demonstrating significant improvements. However, the efficacy of LLMs is considerably influenced by the design of the prompt. The purpose of this paper is to
identify which specific terms in prompts positively or negatively impact relevance
evaluation with LLMs. We employed two types of prompts: those used in previous
research and generated automatically by LLMs. By comparing the performance of
these prompts in both few-shot and zero-shot settings, we analyze the influence of
specific terms in the prompts. We have observed two main findings from our study.
First, we discovered that prompts using the term ‘answer’ lead to more effective
relevance evaluations than those using ‘relevant.’ This indicates that a more direct
approach, focusing on answering the query, tends to enhance performance. Second,
we noted the importance of appropriately balancing the scope of ‘relevance.’ While
the term ‘relevant’ can extend the scope too broadly, resulting in less precise evaluations, an optimal balance in defining relevance is crucial for accurate assessments.
The inclusion of few-shot examples helps in more precisely defining this balance.
By providing clearer contexts for the term ‘relevance,’ few-shot examples contribute
to refine relevance criteria. In conclusion, our study highlights the significance of
carefully selecting terms in prompts for relevance evaluation with LLMs.
Identifying Key Terms in Prompts for Relevance Evaluation with GPT Modelskevig
Relevance evaluation of a query and a passage is essential in Information Retrieval (IR). Recently, numerous studies have been conducted on tasks related to relevance judgment using Large Language Models (LLMs) such as GPT-4, demonstrating significant improvements. However, the efficacy of LLMs is considerably influenced by the design of the prompt. The purpose of this paper is to identify which specific terms in prompts positively or negatively impact relevance evaluation with LLMs. We employed two types of prompts: those used in previous research and generated automatically by LLMs. By comparing the performance of these prompts in both few-shot and zero-shot settings, we analyze the influence of specific terms in the prompts. We have observed two main findings from our study. First, we discovered that prompts using the term ‘answer’ lead to more effective relevance evaluations than those using ‘relevant.’ This indicates that a more direct approach, focusing on answering the query, tends to enhance performance. Second, we noted the importance of appropriately balancing the scope of ‘relevance.’ While the term ‘relevant’ can extend the scope too broadly, resulting in less precise evaluations, an optimal balance in defining relevance is crucial for accurate assessments. The inclusion of few-shot examples helps in more precisely defining this balance. By providing clearer contexts for the term ‘relevance,’ few-shot examples contribute to refine relevance criteria. In conclusion, our study highlights the significance of carefully selecting terms in prompts for relevance evaluation with LLMs.
In recent years, great advances have been made in the speed, accuracy, and coverage of automatic word
sense disambiguator systems that, given a word appearing in a certain context, can identify the sense of
that word. In this paper we consider the problem of deciding whether same words contained in different
documents are related to the same meaning or are homonyms. Our goal is to improve the estimate of the
similarity of documents in which some words may be used with different meanings. We present three new
strategies for solving this problem, which are used to filter out homonyms from the similarity computation.
Two of them are intrinsically non-semantic, whereas the other one has a semantic flavor and can also be
applied to word sense disambiguation. The three strategies have been embedded in an article document
recommendation system that one of the most important Italian ad-serving companies offers to its customers.
Genetic Approach For Arabic Part Of Speech Taggingkevig
With the growing number of textual resources available, the ability to understand them becomes critical.
An essential first step in understanding these sources is the ability to identify the parts-of-speech in each
sentence. Arabic is a morphologically rich language, which presents a challenge for part of speech
tagging. In this paper, our goal is to propose, improve, and implement a part-of-speech tagger based on a
genetic algorithm. The accuracy obtained with this method is comparable to that of other probabilistic
approaches.
Rule Based Transliteration Scheme for English to Punjabikevig
Machine Transliteration has come out to be an emerging and a very important research area in the field of
machine translation. Transliteration basically aims to preserve the phonological structure of words. Proper
transliteration of name entities plays a very significant role in improving the quality of machine translation.
In this paper we are doing machine transliteration for English-Punjabi language pair using rule based
approach. We have constructed some rules for syllabification. Syllabification is the process to extract or
separate the syllable from the words. In this we are calculating the probabilities for name entities (Proper
names and location). For those words which do not come under the category of name entities, separate
probabilities are being calculated by using relative frequency through a statistical machine translation
toolkit known as MOSES. Using these probabilities we are transliterating our input text from English to
Punjabi.
Improving Dialogue Management Through Data Optimizationkevig
In task-oriented dialogue systems, the ability for users to effortlessly communicate with machines and computers through natural language stands as a critical advancement. Central to these systems is the dialogue manager, a pivotal component tasked with navigating the conversation to effectively meet user goals by selecting the most appropriate response. Traditionally, the development of sophisticated dialogue management has embraced a variety of methodologies, including rule-based systems, reinforcement learning, and supervised learning, all aimed at optimizing response selection in light of user inputs. This research casts a spotlight on the pivotal role of data quality in enhancing the performance of dialogue managers. Through a detailed examination of prevalent errors within acclaimed datasets, such as Multiwoz 2.1 and SGD, we introduce an innovative synthetic dialogue generator designed to control the introduction of errors precisely. Our comprehensive analysis underscores the critical impact of dataset imperfections, especially mislabeling, on the challenges inherent in refining dialogue management processes.
Document Author Classification using Parsed Language Structurekevig
Over the years there has been ongoing interest in detecting authorship of a text based on statistical properties of the text, such as by using occurrence rates of noncontextual words. In previous work, these techniques have been used, for example, to determine authorship of all of The Federalist Papers. Such methods may be useful in more modern times to detect fake or AI authorship. Progress in statistical natural language parsers introduces the possibility of using grammatical structure to detect authorship. In this paper we explore a new possibility for detecting authorship using grammatical structural information extracted using a statistical natural language parser. This paper provides a proof of concept, testing author classification based on grammatical structure on a set of “proof texts,” The Federalist Papers and Sanditon which have been as test cases in previous authorship detection studies. Several features extracted from the statisticalnaturallanguage parserwere explored: all subtrees of some depth from any level; rooted subtrees of some depth, part of speech, and part of speech by level in the parse tree. It was found to be helpful to project the features into a lower dimensional space. Statistical experiments on these documents demonstrate that information from a statistical parser can, in fact, assist in distinguishing authors.
Rag-Fusion: A New Take on Retrieval Augmented Generationkevig
Infineon has identified a need for engineers, account managers, and customers to rapidly obtain product information. This problem is traditionally addressed with retrieval-augmented generation (RAG) chatbots, but in this study, I evaluated the use of the newly popularized RAG-Fusion method. RAG-Fusion combines RAG and reciprocal rank fusion (RRF) by generating multiple queries, reranking them with reciprocal scores and fusing the documents and scores. Through manually evaluating answers on accuracy, relevance, and comprehensiveness, I found that RAG-Fusion was able to provide accurate and comprehensive answers due to the generated queries contextualizing the original query from various perspectives. However, some answers strayed off topic when the generated queries' relevance to the original query is insufficient. This research marks significant progress in artificial intelligence (AI) and natural language processing (NLP) applications and demonstrates transformations in a global and multi-industry context.
Performance, Energy Consumption and Costs: A Comparative Analysis of Automati...kevig
The common practice in Machine Learning research is to evaluate the top-performing models based on their performance. However, this often leads to overlooking other crucial aspects that should be given careful consideration. In some cases, the performance differences between various approaches may be insignificant, whereas factors like production costs, energy consumption, and carbon footprint should be taken into account. Large Language Models (LLMs) are widely used in academia and industry to address NLP problems. In this study, we present a comprehensive quantitative comparison between traditional approaches (SVM-based) and more recent approaches such as LLM (BERT family models) and generative models (GPT2 and LLAMA2), using the LexGLUE benchmark. Our evaluation takes into account not only performance parameters (standard indices), but also alternative measures such as timing, energy consumption and costs, which collectively contribute to the carbon footprint. To ensure a complete analysis, we separately considered the prototyping phase (which involves model selection through training-validation-test iterations) and the in-production phases. These phases follow distinct implementation procedures and require different resources. The results indicate that simpler algorithms often achieve performance levels similar to those of complex models (LLM and generative models), consuming much less energy and requiring fewer resources. These findings suggest that companies should consider additional considerations when choosing machine learning (ML) solutions. The analysis also demonstrates that it is increasingly necessary for the scientific world to also begin to consider aspects of energy consumption in model evaluations, in order to be able to give real meaning to the results obtained using standard metrics (Precision, Recall, F1 and so on).
Evaluation of Medium-Sized Language Models in German and English Languagekevig
Large language models (LLMs) have garnered significant attention, but the definition of “large” lacks clarity. This paper focuses on medium-sized language models (MLMs), defined as having at least six billion parameters but less than 100 billion. The study evaluates MLMs regarding zero-shot generative question answering, which requires models to provide elaborate answers without external document retrieval. The paper introduces an own test dataset and presents results from human evaluation. Results show that combining the best answers from different MLMs yielded an overall correct answer rate of 82.7% which is better than the 60.9% of ChatGPT. The best MLM achieved 71.8% and has 33B parameters, which highlights the importance of using appropriate training data for fine-tuning rather than solely relying on the number of parameters. More fine-grained feedback should be used to further improve the quality of answers. The open source community is quickly closing the gap to the best commercial models.
IMPROVING DIALOGUE MANAGEMENT THROUGH DATA OPTIMIZATIONkevig
In task-oriented dialogue systems, the ability for users to effortlessly communicate with machines and
computers through natural language stands as a critical advancement. Central to these systems is the
dialogue manager, a pivotal component tasked with navigating the conversation to effectively meet user
goals by selecting the most appropriate response. Traditionally, the development of sophisticated dialogue
management has embraced a variety of methodologies, including rule-based systems, reinforcement
learning, and supervised learning, all aimed at optimizing response selection in light of user inputs. This
research casts a spotlight on the pivotal role of data quality in enhancing the performance of dialogue
managers. Through a detailed examination of prevalent errors within acclaimed datasets, such as
Multiwoz 2.1 and SGD, we introduce an innovative synthetic dialogue generator designed to control the
introduction of errors precisely. Our comprehensive analysis underscores the critical impact of dataset
imperfections, especially mislabeling, on the challenges inherent in refining dialogue management
processes.
Document Author Classification Using Parsed Language Structurekevig
Over the years there has been ongoing interest in detecting authorship of a text based on statistical properties of the
text, such as by using occurrence rates of noncontextual words. In previous work, these techniques have been used,
for example, to determine authorship of all of The Federalist Papers. Such methods may be useful in more modern
times to detect fake or AI authorship. Progress in statistical natural language parsers introduces the possibility of
using grammatical structure to detect authorship. In this paper we explore a new possibility for detecting authorship
using grammatical structural information extracted using a statistical natural language parser. This paper provides a
proof of concept, testing author classification based on grammatical structure on a set of “proof texts,” The Federalist
Papers and Sanditon which have been as test cases in previous authorship detection studies. Several features extracted
of some depth, part of speech, and part of speech by level in the parse tree. It was found to be helpful to project the
features into a lower dimensional space. Statistical experiments on these documents demonstrate that information
from a statistical parser can, in fact, assist in distinguishing authors.
RAG-FUSION: A NEW TAKE ON RETRIEVALAUGMENTED GENERATIONkevig
Infineon has identified a need for engineers, account managers, and customers to rapidly obtain product
information. This problem is traditionally addressed with retrieval-augmented generation (RAG) chatbots,
but in this study, I evaluated the use of the newly popularized RAG-Fusion method. RAG-Fusion combines
RAG and reciprocal rank fusion (RRF) by generating multiple queries, reranking them with reciprocal
scores and fusing the documents and scores. Through manually evaluating answers on accuracy,
relevance, and comprehensiveness, I found that RAG-Fusion was able to provide accurate and
comprehensive answers due to the generated queries contextualizing the original query from various
perspectives. However, some answers strayed off topic when the generated queries' relevance to the
original query is insufficient. This research marks significant progress in artificial intelligence (AI) and
natural language processing (NLP) applications and demonstrates transformations in a global and multiindustry context
Performance, energy consumption and costs: a comparative analysis of automati...kevig
The common practice in Machine Learning research is to evaluate the top-performing models based on their
performance. However, this often leads to overlooking other crucial aspects that should be given careful
consideration. In some cases, the performance differences between various approaches may be insignificant, whereas factors like production costs, energy consumption, and carbon footprint should be taken into
account. Large Language Models (LLMs) are widely used in academia and industry to address NLP problems. In this study, we present a comprehensive quantitative comparison between traditional approaches
(SVM-based) and more recent approaches such as LLM (BERT family models) and generative models (GPT2 and LLAMA2), using the LexGLUE benchmark. Our evaluation takes into account not only performance
parameters (standard indices), but also alternative measures such as timing, energy consumption and costs,
which collectively contribute to the carbon footprint. To ensure a complete analysis, we separately considered the prototyping phase (which involves model selection through training-validation-test iterations) and
the in-production phases. These phases follow distinct implementation procedures and require different resources. The results indicate that simpler algorithms often achieve performance levels similar to those of
complex models (LLM and generative models), consuming much less energy and requiring fewer resources.
These findings suggest that companies should consider additional considerations when choosing machine
learning (ML) solutions. The analysis also demonstrates that it is increasingly necessary for the scientific
world to also begin to consider aspects of energy consumption in model evaluations, in order to be able to
give real meaning to the results obtained using standard metrics (Precision, Recall, F1 and so on).
EVALUATION OF MEDIUM-SIZED LANGUAGE MODELS IN GERMAN AND ENGLISH LANGUAGEkevig
Large language models (LLMs) have garnered significant attention, but the definition of “large” lacks
clarity. This paper focuses on medium-sized language models (MLMs), defined as having at least six
billion parameters but less than 100 billion. The study evaluates MLMs regarding zero-shot generative
question answering, which requires models to provide elaborate answers without external document
retrieval. The paper introduces an own test dataset and presents results from human evaluation. Results
show that combining the best answers from different MLMs yielded an overall correct answer rate of
82.7% which is better than the 60.9% of ChatGPT. The best MLM achieved 71.8% and has 33B
parameters, which highlights the importance of using appropriate training data for fine-tuning rather than
solely relying on the number of parameters. More fine-grained feedback should be used to further improve
the quality of answers. The open source community is quickly closing the gap to the best commercial
models.
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTjpsjournal1
The rivalry between prominent international actors for dominance over Central Asia's hydrocarbon
reserves and the ancient silk trade route, along with China's diplomatic endeavours in the area, has been
referred to as the "New Great Game." This research centres on the power struggle, considering
geopolitical, geostrategic, and geoeconomic variables. Topics including trade, political hegemony, oil
politics, and conventional and nontraditional security are all explored and explained by the researcher.
Using Mackinder's Heartland, Spykman Rimland, and Hegemonic Stability theories, examines China's role
in Central Asia. This study adheres to the empirical epistemological method and has taken care of
objectivity. This study analyze primary and secondary research documents critically to elaborate role of
china’s geo economic outreach in central Asian countries and its future prospect. China is thriving in trade,
pipeline politics, and winning states, according to this study, thanks to important instruments like the
Shanghai Cooperation Organisation and the Belt and Road Economic Initiative. According to this study,
China is seeing significant success in commerce, pipeline politics, and gaining influence on other
governments. This success may be attributed to the effective utilisation of key tools such as the Shanghai
Cooperation Organisation and the Belt and Road Economic Initiative.
A SYSTEMATIC RISK ASSESSMENT APPROACH FOR SECURING THE SMART IRRIGATION SYSTEMSIJNSA Journal
The smart irrigation system represents an innovative approach to optimize water usage in agricultural and landscaping practices. The integration of cutting-edge technologies, including sensors, actuators, and data analysis, empowers this system to provide accurate monitoring and control of irrigation processes by leveraging real-time environmental conditions. The main objective of a smart irrigation system is to optimize water efficiency, minimize expenses, and foster the adoption of sustainable water management methods. This paper conducts a systematic risk assessment by exploring the key components/assets and their functionalities in the smart irrigation system. The crucial role of sensors in gathering data on soil moisture, weather patterns, and plant well-being is emphasized in this system. These sensors enable intelligent decision-making in irrigation scheduling and water distribution, leading to enhanced water efficiency and sustainable water management practices. Actuators enable automated control of irrigation devices, ensuring precise and targeted water delivery to plants. Additionally, the paper addresses the potential threat and vulnerabilities associated with smart irrigation systems. It discusses limitations of the system, such as power constraints and computational capabilities, and calculates the potential security risks. The paper suggests possible risk treatment methods for effective secure system operation. In conclusion, the paper emphasizes the significant benefits of implementing smart irrigation systems, including improved water conservation, increased crop yield, and reduced environmental impact. Additionally, based on the security analysis conducted, the paper recommends the implementation of countermeasures and security approaches to address vulnerabilities and ensure the integrity and reliability of the system. By incorporating these measures, smart irrigation technology can revolutionize water management practices in agriculture, promoting sustainability, resource efficiency, and safeguarding against potential security threats.
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...University of Maribor
Slides from talk presenting:
Aleš Zamuda: Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapter and Networking.
Presentation at IcETRAN 2024 session:
"Inter-Society Networking Panel GRSS/MTT-S/CIS
Panel Session: Promoting Connection and Cooperation"
IEEE Slovenia GRSS
IEEE Serbia and Montenegro MTT-S
IEEE Slovenia CIS
11TH INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONIC AND COMPUTING ENGINEERING
3-6 June 2024, Niš, Serbia
Using recycled concrete aggregates (RCA) for pavements is crucial to achieving sustainability. Implementing RCA for new pavement can minimize carbon footprint, conserve natural resources, reduce harmful emissions, and lower life cycle costs. Compared to natural aggregate (NA), RCA pavement has fewer comprehensive studies and sustainability assessments.
Advanced control scheme of doubly fed induction generator for wind turbine us...IJECEIAES
This paper describes a speed control device for generating electrical energy on an electricity network based on the doubly fed induction generator (DFIG) used for wind power conversion systems. At first, a double-fed induction generator model was constructed. A control law is formulated to govern the flow of energy between the stator of a DFIG and the energy network using three types of controllers: proportional integral (PI), sliding mode controller (SMC) and second order sliding mode controller (SOSMC). Their different results in terms of power reference tracking, reaction to unexpected speed fluctuations, sensitivity to perturbations, and resilience against machine parameter alterations are compared. MATLAB/Simulink was used to conduct the simulations for the preceding study. Multiple simulations have shown very satisfying results, and the investigations demonstrate the efficacy and power-enhancing capabilities of the suggested control system.
International Conference on NLP, Artificial Intelligence, Machine Learning an...gerogepatton
International Conference on NLP, Artificial Intelligence, Machine Learning and Applications (NLAIM 2024) offers a premier global platform for exchanging insights and findings in the theory, methodology, and applications of NLP, Artificial Intelligence, Machine Learning, and their applications. The conference seeks substantial contributions across all key domains of NLP, Artificial Intelligence, Machine Learning, and their practical applications, aiming to foster both theoretical advancements and real-world implementations. With a focus on facilitating collaboration between researchers and practitioners from academia and industry, the conference serves as a nexus for sharing the latest developments in the field.
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...IJECEIAES
Climate change's impact on the planet forced the United Nations and governments to promote green energies and electric transportation. The deployments of photovoltaic (PV) and electric vehicle (EV) systems gained stronger momentum due to their numerous advantages over fossil fuel types. The advantages go beyond sustainability to reach financial support and stability. The work in this paper introduces the hybrid system between PV and EV to support industrial and commercial plants. This paper covers the theoretical framework of the proposed hybrid system including the required equation to complete the cost analysis when PV and EV are present. In addition, the proposed design diagram which sets the priorities and requirements of the system is presented. The proposed approach allows setup to advance their power stability, especially during power outages. The presented information supports researchers and plant owners to complete the necessary analysis while promoting the deployment of clean energy. The result of a case study that represents a dairy milk farmer supports the theoretical works and highlights its advanced benefits to existing plants. The short return on investment of the proposed approach supports the paper's novelty approach for the sustainable electrical system. In addition, the proposed system allows for an isolated power setup without the need for a transmission line which enhances the safety of the electrical network
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsVictor Morales
K8sGPT is a tool that analyzes and diagnoses Kubernetes clusters. This presentation was used to share the requirements and dependencies to deploy K8sGPT in a local environment.
Embedded machine learning-based road conditions and driving behavior monitoringIJECEIAES
Car accident rates have increased in recent years, resulting in losses in human lives, properties, and other financial costs. An embedded machine learning-based system is developed to address this critical issue. The system can monitor road conditions, detect driving patterns, and identify aggressive driving behaviors. The system is based on neural networks trained on a comprehensive dataset of driving events, driving styles, and road conditions. The system effectively detects potential risks and helps mitigate the frequency and impact of accidents. The primary goal is to ensure the safety of drivers and vehicles. Collecting data involved gathering information on three key road events: normal street and normal drive, speed bumps, circular yellow speed bumps, and three aggressive driving actions: sudden start, sudden stop, and sudden entry. The gathered data is processed and analyzed using a machine learning system designed for limited power and memory devices. The developed system resulted in 91.9% accuracy, 93.6% precision, and 92% recall. The achieved inference time on an Arduino Nano 33 BLE Sense with a 32-bit CPU running at 64 MHz is 34 ms and requires 2.6 kB peak RAM and 139.9 kB program flash memory, making it suitable for resource-constrained embedded systems.
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...IJECEIAES
Medical image analysis has witnessed significant advancements with deep learning techniques. In the domain of brain tumor segmentation, the ability to
precisely delineate tumor boundaries from magnetic resonance imaging (MRI)
scans holds profound implications for diagnosis. This study presents an ensemble convolutional neural network (CNN) with transfer learning, integrating
the state-of-the-art Deeplabv3+ architecture with the ResNet18 backbone. The
model is rigorously trained and evaluated, exhibiting remarkable performance
metrics, including an impressive global accuracy of 99.286%, a high-class accuracy of 82.191%, a mean intersection over union (IoU) of 79.900%, a weighted
IoU of 98.620%, and a Boundary F1 (BF) score of 83.303%. Notably, a detailed comparative analysis with existing methods showcases the superiority of
our proposed model. These findings underscore the model’s competence in precise brain tumor localization, underscoring its potential to revolutionize medical
image analysis and enhance healthcare outcomes. This research paves the way
for future exploration and optimization of advanced CNN models in medical
imaging, emphasizing addressing false positives and resource efficiency.
Batteries -Introduction – Types of Batteries – discharging and charging of battery - characteristics of battery –battery rating- various tests on battery- – Primary battery: silver button cell- Secondary battery :Ni-Cd battery-modern battery: lithium ion battery-maintenance of batteries-choices of batteries for electric vehicle applications.
Fuel Cells: Introduction- importance and classification of fuel cells - description, principle, components, applications of fuel cells: H2-O2 fuel cell, alkaline fuel cell, molten carbonate fuel cell and direct methanol fuel cells.
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesChristina Lin
Traditionally, dealing with real-time data pipelines has involved significant overhead, even for straightforward tasks like data transformation or masking. However, in this talk, we’ll venture into the dynamic realm of WebAssembly (WASM) and discover how it can revolutionize the creation of stateless streaming pipelines within a Kafka (Redpanda) broker. These pipelines are adept at managing low-latency, high-data-volume scenarios.
Understanding Inductive Bias in Machine LearningSUTEJAS
This presentation explores the concept of inductive bias in machine learning. It explains how algorithms come with built-in assumptions and preferences that guide the learning process. You'll learn about the different types of inductive bias and how they can impact the performance and generalizability of machine learning models.
The presentation also covers the positive and negative aspects of inductive bias, along with strategies for mitigating potential drawbacks. We'll explore examples of how bias manifests in algorithms like neural networks and decision trees.
By understanding inductive bias, you can gain valuable insights into how machine learning models work and make informed decisions when building and deploying them.
Literature Review Basics and Understanding Reference Management.pptxDr Ramhari Poudyal
Three-day training on academic research focuses on analytical tools at United Technical College, supported by the University Grant Commission, Nepal. 24-26 May 2024
Literature Review Basics and Understanding Reference Management.pptx
PREDICTING ELECTION OUTCOME FROM SOCIAL MEDIA DATA
1. International Journal on Natural Language Computing (IJNLC) Vol.9, No.1, February 2020
DOI: 10.5121/ijnlc.2020.9103 21
PREDICTING ELECTION OUTCOME FROM SOCIAL
MEDIA DATA
Badhan Chandra Das1
and Md Musfique Anwar1
1
Department of Computer Science and Engineering, Jahangirnagar University,
Bangladesh.
badhan0951@gmail.com
manwar@juniv.edu
ABSTRACT
In this era of technology, enormous Online Social Networking Sites (OSNs) have arisen as a medium of
expressing any opinions, thoughts towards anything even support their status against any social or political
matter at the same time. Nowadays, people connected to those networks are more likely to prefer to employ
themselves utilizing these online platforms to exhibit their standings upon any political organizations
participating in the election throughout the whole election period. The aim of this paper is to predict the
outcome of the election by engaging the tweets posted on Twitter pertaining to the Australian federal
election-2019 held on May 18, 2019. We aggregated two efficacious techniques in order to extract the
information from the tweet data to count a virtual vote for each corresponding political group. The original
results of the election closely match the findings of our investigation, published by the Australian Electoral
Commission.
KEYWORDS
Election Prediction, Social Network Analysis, Twitter, Regular Expression.
1. INTRODUCTION
Over the last few years, OSNs have become a reliable and effective medium of communication
and source of important information and news about daily life. OSNs and other micro-blogging
sites are such platforms which not only provides the option to be connected with their friends and
family members, colleagues, acquaintances, fans and their other well-wishers but it also offers the
facility to its users to share different contents like as pictures, texts, videos and others as well.
Approximately, 3.09 billion people are connected to various microblogging platforms including
Facebook, Twitter, Instagram, Google+, etc [15]. These vast number of people frequently use
these online platforms to share their opinion regarding the different affairs of their regular life. In
recent years, these OSNs have appeared one of the popular research fields among many
prominent researchers over the world [1]. Extracting meaningful information utilizing data of
those platforms is a notable task of their research. Twitter is such an online social platform where
anyone can write their thoughts about any issue in 140 characters and spread it among their
connections. Now it has become one of the greatest origins of news, with above 200 million
active users [2].
People share their opinion on various topics, for instance, business, sports, politics, entertainment,
literature and culture and so on. During election season concerned people often post tweets on
Twitter regarding different political parties including their success, failure, manifestos, positive,
negative aspects of their activities or about any specific leader of the corresponding party and also
several other stuff. The activity of voters on micro-blogs reflect their sentiment to any specific
2. International Journal on Natural Language Computing (IJNLC) Vol.9, No.1, February 2020
22
political group. Their tweets regarding any political party can be in favour of them or against
them. These micro-blogs often play a significant role not only in terms of understanding of public
intuition towards any party taking part in the election but also promoting those political groups.
Having an aim of winning the election, it is very essential to prepare a manifesto and set the
strategy to boost up promotional activities in accordance with the demand of mass people.
Oftentimes, the turnout of an election can be predicted approximately from these OSN posts by
analyzing the texts of that post on the basis of various relevant parameters. That is why
forecasting election outcomes is one of the major and trending tasks of the eminent researchers of
text processing. In this paper, we introduce a very simple and straightforward technique of
predicting election results based on tweets counts. We have taken into count those tweets, which
contain political party names, abbreviations of the party name, any of their leaders' names, the
moto of that party and all possible ways by which any corresponding party can be mentioned so
that the calculation of estimation of election produces higher accuracy. In brief, the basic
approaches of our contributions are as following:
We propose an easy and novel method of counting system by two-way extraction of the
virtual support from twitter mentions of political parties during the election season.
We perform a comprehensive experiment on a real-world Twitter dataset collected from
Kaggle [16] containing 1.8 million tweets collected throughout the election period.
We explain the techniques followed for recognizing the mention of any specific party in a
particular tweet and also present the comparison of the result of our predicted outcomes
with the original results of the election.
The remainder of this paper consists of as follows. We describe some relevant works already
done in this field in Section 2. In Section 3, we explain the formation of political parties. Section
4 will contain the methodologies used to perform the experiment and the computational equations
in order to calculate the election outcome. Section 5 will describe the validation of the
experimental results of forecasting as well as make some discussion concerning the outcomes of
our presented prophecy. Lastly, the paper puts an end with the conclusion in section 6.
2. RELATED WORKS
Among a bunch of research fields in Text Mining and Natural Language Processing (NLP),
forecasting election results, as well as analysis of the public concern on politics, have able to seek
the attention of renowned researchers over the world. Earlier works reported some notable works
in this field. Besides, political parties are always eager to discover public opinion about them,
hence, research on this specific arena has become really essential [3].
Basically, previous investigations have been explored this field from several points of view as
follows.
2.1. Analyzing Public Perspective from Social Media
Karami et. al. [4] reported an approach of extracting public opinion in order to explore the
discussion of the economic issue during the election period 2012 US presidential election on
Twitter. Another Twitter dataset was employed to study public opinion in the case of UK General
Election 2010 [5]. Studies have done on predicting the political preferences of users of twitter
networks by their online actions [7]. Apart from political issues public opinion mining tasks have
been done for several other purposes e.g. decision making for business strategy [17] [18], Kim et.
al. have shown how reviews of social media can be a determinant of improvement of restaurants’
3. International Journal on Natural Language Computing (IJNLC) Vol.9, No.1, February 2020
23
performance [19]. Gadekallu et. al. worked on an IMDB data and showed applications of
sentiment analysis e.g. determining marketing strategy, improving campaign success, etc [20].
2.2. Election Prediction Using Sentiment Analysis
Sentiment analysis is one of the popular manners of predicting elections among researchers. This
approach has been used by the researchers in order to predict elections of several countries. For
instance, Indonesia Presidential election [9], The Nigeria Presidential Election 2019 [8], French
Election [13]. In accordance with that, Heredia et. al. [3] proposed a prediction technique
incorporating location-based sentiment analysis of U.S. Presidential Election 2016. Oyebode et.
al. [8] claim a new lexicon-based approach named VADER-EXT which outperforms two other
sentiment lexicon library-based approaches known as VADER and Textblob.
2.3. Mixed Approaches
Karami et. al. [4] combined both sentiment analysis and topic modeling approach to find out
meaningful information from tweets. Heredia et. al. [6] reported an approach of predicting
election of a mixed approach which includes both polling and sentiment analysis. Unankard et. al.
[10] introduced a technique of forecasting the election result by associating both sub-event and
sentiment analysis. Sanders et. al. [12] in 2016, used demographic information to predict Dutch
election results, again, in 2019, asserted that, demographic statistics do not really help in
forecasting the elections and proposed a simple tweet count method for doing so [11]. Sanga et.
al. proposed a Bayesian network model on Indian general election, 2019 [21].
All the approaches of forecasting the turnout of election cited above are a bit complex and time
consuming since they need to employ several lexicon libraries and probabilistic theorem. Our
proposed method consists of a very simple and effective procedure for performing the above task.
3. FORMATION OF AUSTRALIAN POLITICAL PARTIES
In this paper, we are about to report an investigation on a dataset of Australian federal election
2019 which took place on 18th May 2019. Several parties participated in the election in order to
form the government. A group of parties often make an alliance between them to take part in
elections as well as forming the government and for working together later. Making such
collaboration between political groups like this is generally known as coalition. The Australian
federal election has also seen such a coalition. Table 1. shows the names of the major parties,
coalitions, and their leader names.
Table 1. Major Political Parties, their Coalitions and the leaders
4. International Journal on Natural Language Computing (IJNLC) Vol.9, No.1, February 2020
24
4. PROPOSED METHODOLOGY
4.1. Counting Procedure of Tweets
A huge amount of Twitter dataset which contains around 1.8 Million tweets collected from 10
May 2019 to 20 May 2019. As the election date comes near, the frequency of posting tweets
regarding election increases rapidly. A political party in a tweet can be mentioned in various
ways. Sometimes a party name may be a bit large, besides, writing the full name with the proper
spelling can be daunting as well as time-consuming. This is why referring the full name of any
political party in a tweet is not convenient always. Moreover, it is not mandatory to mention the
full official name of any organization on such a casual online platform like Twitter. Similarly,
from our observation of the dataset, we noticed that most of the users who posted about any
political party on Twitter did not write the complete formal name of the party always. People
commonly used acronyms, a portion of the name by which a party can be recognized, the name of
the leader of the corresponding party or any philosophy, ideology or doctrine of that respective
political group to espouse them. So, determining the tweets from a huge amount of data contains
names of the specific political wing, needs a generalized method that can extract only those data
pertaining to analogous information regarding that party.
Considering all those points above, we have chosen the following approaches to construct the
counting process more efficient.
4.1.1. Regular Expressions for Party Mentions
Regular Expression is such a technique to find any specific or any conditional pattern, characters,
numbers or symbols in a text. Utilizing the above technique to identify the party names in the
tweets by taking almost all possible cases into account through which a party can be cited in an
instance of a tweet. For example, the regular expression for the first instance of Table 2 will
detect the mention of the party Liberal National Coalition as Liberal National Coalition, LNC,
L.N.C., Liberal Party and all other possible cases. The Regular Expressions we employed in order
to find party mentions shown in the following table.
Table 2. Regular Expressions to find party mentions in tweets
5. International Journal on Natural Language Computing (IJNLC) Vol.9, No.1, February 2020
25
4.1.2. Search for Specific Relevant Keywords
Sometimes it is seen that mass public quote the popular morals, taglines, ideologies or the name
of the leader or the founder of any political group to show their enthusiasm towards the party. As
a result, any generalized strategy is unable to recognize these types of implicit mentions in the
tweets. Taking this issue into account, in order to resolve this problem, we manually searched
according to the above-mentioned keys in the tweets. We engaged str.contains(‘’) function
offered by Pandas library in python version 3.7.
After having outputs from both operations, we aggregate both of the outputs to perform the
computational experiments to get the prediction result.
4.2. Parameters Consideration
i. We have ignored the location, occupation and other demographic features of twitter users
since this paper introduces a very simple approach to forecast election results.
ii. Removal of redundant instances and elimination of stopwords has been done at the early
stage of the experiment.
iii. Retweets and comments were not taken into counts because retweeting not denote the
support for the original tweet post every time. Along with this, URLs in a tweet don't
hold much information in terms of advocation for any particular group. For this reason,
URLs also get outweighed our attention to be counted as a parameter.
iv. As we stated earlier, the tendency of tweeting regarding election rises considerably when
the date of election comes closer. Figure 1. demonstrates this statement graphically,
where X-axis denotes the date of the month of May 2019 and the Y-axis indicates the
number of tweets.
v. Our consideration was to measure the mentions for the top three parties received 96.7%
votes in the actual election held on 18th May 2019 in Australia.
Figure 1: Graphical representation of party mentions during election period
6. International Journal on Natural Language Computing (IJNLC) Vol.9, No.1, February 2020
26
4.3. Average Tweets Count and Percentage
Let MC(X)(i) be the counts for mentions on behalf of party X on the day i and N is the sum of the
day counts. Un is the number of unique users tweets in favor of party X, then, the average tweet
counts per user mention among all the tweets for party X is,
Where ATCP(X) denotes the average tweets counts per user mention in the tweet for party X,
If the number of political parties is P, then the party X acquires the percentage of mentions
denoted by PREC(X) shown in Eq. 2. Where the numerator is computed after applying ceiling
function on the values yielding from Eq. 1 and denominator represents the summation of all
average tweet counts for P parties. Ceiling function of x is the value gives the smallest integer
which is greater than or equal x. For example, Ceil(3.5) will provide 4 as output.
5. EXPERIMENTAL RESULT AND DISCUSSION
The evaluation of our proposed framework has been conducted by two of the most common and
well recognized metrics named Mean Absolute Error (MAE) and Root Mean Squared Error
(RMSE), which are frequently used to measure accuracy for continuous variables.
Mean Absolute Error (AE) it is a term which determines the measurement of flaws in the
assessment compared with original values. It is also known as Absolute Accuracy Error (AAE).
MAE is the average of all AEs. In our proposed context, PRECel(i) and PRECmc(i) refer to the
percentage of vote earned by party i in the election and the percentage of the mention counts for
party i among the tweets through our suggested system.
Root Mean Squared Error (RMSE) is known as a measurement of standard deviation which
denotes how much spread out the predicted value is from the real value. Usually, this procedure is
appropriate for determining the standard deviation of the residuals. Residuals are referred to the
prediction errors which means how distant the regression line is from the actual data points. Eq. 4
shows how RMSE is calculated.
The implicit indication behind the above discussion is that the less the value of these two metrics
the greater the accuracy of the framework will be.
The comparison between the percentages of our predicted outcome and the real election result
represented graphically in Figure 2. We can see clearly that, the flaw of our investigation is very
7. International Journal on Natural Language Computing (IJNLC) Vol.9, No.1, February 2020
27
little for the top two parties according to the real turnout of the election that took place on 18th of
may [14].
Figure 2: The comparison between the percentage of forecast from tweets and votes of actual election for
three major political parties
The correctness of any system grows higher as the value of MAE and RMSE drops and the value
of RMSE will always be greater than the value of MAE. Our experimental result follows these
ground rules of the two above mentioned accuracy metrics. The value of the MAE and RMSE of
the experimental result is shown in Table 3.
Table 3. The Result of Two Evaluation Metrics with respect to the main election
6. CONCLUSION
In this paper, we meticulously examined a dataset of Australian federal election 2019 and made a
forecast of the percentage of votes gained by each party. The counting procedure followed two
useful manners to extract party mentions in an instance of a tweet. The obtained outcome yields
our proposed approach closely matches the actual percentage count of the election and gives a
little error. The evaluation of the prediction outcomes has executed by two most well-known
accuracy metrics.
8. International Journal on Natural Language Computing (IJNLC) Vol.9, No.1, February 2020
28
REFERENCES
[1] Franch, Fabio: (Wisdom of the Crowds) 2: 2010 UK election prediction with social media. In: Journal
of Information Technology & Politics, v.10, n.1, pp. 57-71, 2013.
[2] Wang, Lei, and Gan, John Q: Prediction of the 2017 French election based on Twitter data analysis.
In: 2017 9th Computer Science and Electronic Engineering (CEEC), pp. 89-93, 2017.
[3] Heredia, Brian and Prusa, Joseph D and Khoshgoftaar, Taghi M: Location-based twitter sentiment
analysis for predicting the U.S. 2016 presidential election. In Thirty-First International Florida
Artificial Intelligence Research Society Conference (FLAIRS-31), 2018.
[4] Karami, Amir and Bennett, London S and He, Xiaoyun: Mining public opinion about economic
issues: Twitter and the U.S. presidential election. In: International Journal of Strategic Decision
Sciences (IJSDS), v.9, n.1, pp. 18-28, 2018.
[5] Anstead, Nick and O'Loughlin, Ben: Social media analysis and public opinion: The 2010 UK general
election. In: Journal of Computer-Mediated Communication, v.20, n.2 pp. 204-220, 2015.
[6] Heredia, Brian and Prusa, Joseph D and Khoshgoftaar, Taghi M: Social media for polling and
predicting United States election outcome. In: Social Network Analysis and Mining, v.8, n.1, pp. 48-
64, 2018
[7] Makazhanov, Aibek and Rafiei, Davood and Waqar, Muhammad: Predicting political preference of
Twitter users, In: Social Network Analysis and Mining, v.4, n.1, pp. 193, 2014.
[8] Oyebode, Oladapo and Orji, Rita: Social Media and Sentiment Analysis: The Nigeria Presidential
Election 2019, In: 10th Annual Information Technology, Electronics and Mobile Communication
Conference (IEMCON), pp. 140-146, 2019
[9] Budiharto, Widodo and Meiliana, Meiliana: Prediction and analysis of Indonesia Presidential election
from Twitter using sentiment analysis, In: Journal of Big data, v.5, n.1, pp. 51, 2018
[10] Unankard, Sayan and Li, Xue and Sharaf, Mohamed and Zhong, Jiang and Li, Xueming: Predicting
elections from social networks based on sub-event detection and sentiment analysis, In: International
Conference on Web Information Systems Engineering, pp. 1-16, 2014
[11] Sanders, Eric and van den Bosch, Antal: A Longitudinal Study on Twitter-Based Forecasting of Five
Dutch National Elections, In: International Conference on Social Informatics, pp. 128-142, 2019
[12] Sanders, Eric and de Gier, Michelle and van den Bosch, Antal: Using demographics in predicting
election results with Twitter, In: International Conference on Social Informatics, pp. 259-268, 2016.
[13] Wang, Lei and Gan, John Q: Prediction of the 2017 French election based on Twitter data analysis,
In: 9th Computer Science and Electronic Engineering (CEEC), pp. 89-93, 2017.
[14] AustralianFederal Election Result Party Totals 2019:
https://www.abc.net.au/news/elections/federal/2019/results/party-totals [Last Accessed 28 Jan. 2020]
[15] Number of social network users: https://www.statista.com/statistics/278414/number-of-worldwide-
social-network-users/ [Last Accessed 28 Jan. 2020]
[16] Australian Election 2019 Tweets: https://www.kaggle.com/taniaj/australian-election-2019-tweets
[Last Accessed 28 Jan. 2020]
[17] Shen, Chien-Wen and Ho, Jung-Tsung: Public Opinion Toward Social Business from a Social Media
Perspective, In: International Conference on Data Mining and Big Data, pp. 555-562, 2018.
9. International Journal on Natural Language Computing (IJNLC) Vol.9, No.1, February 2020
29
[18] He, Wu and Wang, Feng-Kwei and Akula, Vasudeva: Managing extracted knowledge from big social
media data for business decision making, In: Journal of Knowledge Management, 2017.
[19] Kim, Woo Gon and Li, Jun Justin and Brymer, Robert A: The impact of social media reviews on
restaurant performance: The moderating role of excellence certificate, In: International Journal of
Hospitality Management, v.55, pp. 41-51, 2016.
[20] Gadekallu, ThippaReddy and Soni, Akshat and Sarkar, Deeptanu and Kuruva, Lakshmanna:
Application of Sentiment Analysis in Movie reviews, In: Sentiment Analysis and Knowledge
Discovery in Contemporary Business, pp. 77-90, 2019.
[21] Sanga, Aniruddh and Samuel, Ashirwad and Rathaur, Nidhi and Abimbola, Pelumi and Babbar,
Sakshi: Bayesian Prediction on PM Modi’s Future in 2019, In: Proceedings of International
Conference on Recent Innovations in Computing, pp. 885-897, 2020.
AUTHORS
Badhan Chandra Das is an M.Sc. student of Computer Science and Engineering,
Jahangirnagar University, Savar, Dhaka, Bangladesh. He has completed his B.Sc.
also in Computer Science and Engineering, from Jahangirnagar University in 2017.
His research is concentrated on Data Mining, Social Network Analysis, Natural
Language Processing and developing Machine Learning models for Intelligent tools.
He has several publications on international conferences. Currently, he is working on
a funded research project of University Grants Commission (UGC) of Bangladesh.
Md Musfique Anwar has awarded PhD degree from the Department of Computer
Science and Software Engineering, Faculty of Science, Engineering and Technology
of Swinburne University of Technology, Melbourne, Australia in 2018. He has
received M.Sc. degree from the Department of Intelligence Science and Technology,
Graduate School of Informatics of Kyoto University, Japan in 2013 and B.Sc. degree
in Computer Science and Engineering from Jahangirnagar University, Savar, Dhaka,
Bangladesh in 2006. Since 2008, he is a faculty member having current designation
Associate Professor in the Department of Computer Science and Engineering of Jahangirnagar University,
Savar, Dhaka, Bangladesh. Currently, his research focuses on Data Mining, Social Network Analysis,
Natural Language Processing and Software Engineering. He achieved Best Student Paper award in 29th
Australasian Database Conference (ADC) in 2018, Best Poster award in 26th Australasian Database
Conference (ADC) in 2015 and Best Poster Paper award in International Workshop on Computer Vision
and Intelligent Systems-2019 (IWCVIS2019) in 2019.