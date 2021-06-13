Last Tuesday, 8th June 2021, was a day that made me proud. On that day, my team presented our project, Twitter Sentiment Analysis on Dogecoin. As a full-time programmer on this project, I was pretty sure it was going to be interesting.

The first step was to collect the data. We used the twint library in Google Colaboratory to scrape tweets posted from 1st June 2021 to 5th June 2021, and the R programming language to process the data. Then we performed the preprocessing steps: transforming all the tweets to lower case and removing usernames, punctuation, HTML links, hashtags, etc. We also removed English stopwords and made a bar plot showing the top 5 most frequent words across all tweets. A word cloud was also generated to show the words intuitively, based on their frequency.
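
The project did this step in R, and that code is not shown here. As a rough illustration only, here is a minimal Python sketch of the same kind of cleaning and word counting. The function names (`clean_tweet`, `top_words`) and the tiny stopword list are assumptions made for this example, not the project's actual code.

```python
import re
import string
from collections import Counter

# Tiny illustrative stopword list (an assumption for this sketch);
# a real run would use a full English stopword list.
STOPWORDS = {"a", "an", "and", "the", "is", "to", "of", "in", "on", "for", "it", "this"}


def clean_tweet(text):
    """Lowercase a tweet, strip links, usernames, hashtags and punctuation,
    then drop stopwords. Returns a list of remaining words."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # links
    text = re.sub(r"[@#]\w+", " ", text)                 # usernames and hashtags
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [word for word in text.split() if word not in STOPWORDS]


def top_words(tweets, n=5):
    """Return the n most frequent words across all cleaned tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(clean_tweet(tweet))
    return counts.most_common(n)
```

Calling `top_words(tweets)` on a list of tweet strings gives the (word, count) pairs that a top-5 bar plot would display.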

The next step was calculating the sentiment scores. We used the NRC Emotion Lexicon, a list of English words and their associations with 2 sentiments (positive, negative) and 8 basic human emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and disgust), proposed by Saif Mohammad. What happened next was amazing.
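
To make the idea of lexicon-based scoring concrete, here is a small Python sketch. The `TOY_LEXICON` entries below are hypothetical stand-ins invented for illustration; they are not taken from the real NRC Emotion Lexicon, and `emotion_scores` is an assumed helper name rather than the project's code.

```python
from collections import Counter

# Hypothetical toy lexicon for illustration only. The real NRC Emotion
# Lexicon is far larger and its actual word associations may differ.
TOY_LEXICON = {
    "crash": {"negative", "fear", "sadness"},
    "scam": {"negative", "anger", "disgust"},
    "moon": {"positive", "anticipation", "joy"},
    "hold": {"positive", "trust"},
}


def emotion_scores(tokens, lexicon=TOY_LEXICON):
    """Count, for each sentiment or emotion, how many tokens are associated
    with it in the lexicon. Words missing from the lexicon add nothing."""
    scores = Counter()
    for token in tokens:
        scores.update(lexicon.get(token, ()))
    return scores
```

Summing these counts over all tweets from the same day would give one score per emotion per day, which is the kind of series that can be compared with a price chart.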

Since the purpose of data mining is to uncover patterns in data, we looked for them and found that emotions like anger, fear, and sadness are associated with the Dogecoin price chart. As the Dogecoin price went down after 3rd June 2021, the scores for those three emotions went up. This means that when the price falls, the number of angry, fearful, and sad tweets on Twitter increases. We also made a line chart of closing price vs. tweet volume, which indicates that when the price goes down, the number of tweets about Dogecoin rises.
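
One simple way to put a number on this kind of association is a Pearson correlation between the daily closing price and a daily emotion score. The project itself read the pattern from charts, so the Python sketch below is a swapped-in technique, and the two short lists are made-up placeholder values, not the project's data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Placeholder numbers for illustration only (NOT real prices or scores).
toy_close = [5.0, 4.0, 3.0, 2.0]
toy_fear = [10, 20, 30, 40]
r = pearson(toy_close, toy_fear)
```

A value of `r` close to -1 would mean the emotion score tends to rise as the price falls, which is the direction described above. With only a handful of days, such a number is a rough indication at best.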

I hope this post brings an insightful perspective to whoever reads it. The code is coming soon on my GitHub. Happy Data Mining!