Content-based Classification of Political Inclinations of Twitter Users



Social networks are huge continuous sources of information that can be used to analyze people's behavior and thoughts.

Our goal is to extract such information and predict political inclinations of users.

In particular, we investigate the importance of syntactic features of texts written by users when they post on social media. Our hypothesis is that people belonging to the same political party write in similar ways, thus they can be classified properly on the basis of the words that they use.

We analyze tweets because Twitter is commonly used in Italy to discuss politics; moreover, it provides an official API that makes data extraction straightforward. We applied many classifiers to different kinds of features and NLP vectorization methods to find the combination that best supports our hypothesis.
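As a rough illustration of what "vectorization plus classifier" can look like, the sketch below builds TF-IDF vectors from one text per account and assigns a new text to the party with the most similar centroid. It is not the pipeline from the paper: the choice of TF-IDF with a nearest-centroid rule, the function names, the party labels, and the toy texts are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on whitespace. A real pipeline would also
    # handle URLs, mentions, hashtags, and punctuation.
    return text.lower().split()


def fit_idf(docs):
    # Smoothed inverse document frequency for every word in the corpus.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in df.items()}


def vectorize(doc, idf):
    # Sparse, L2-normalized TF-IDF vector; unknown words are ignored.
    tf = Counter(w for w in tokenize(doc) if w in idf)
    vec = {w: c * idf[w] for w, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {w: v / norm for w, v in vec.items()} if norm else {}


def train_centroids(docs, labels, idf):
    # Sum the vectors of each class, then normalize each sum to unit length.
    sums = defaultdict(lambda: defaultdict(float))
    for doc, label in zip(docs, labels):
        for w, v in vectorize(doc, idf).items():
            sums[label][w] += v
    centroids = {}
    for label, vec in sums.items():
        norm = math.sqrt(sum(v * v for v in vec.values()))
        centroids[label] = {w: v / norm for w, v in vec.items()}
    return centroids


def predict(doc, idf, centroids):
    # Cosine similarity reduces to a dot product for unit-length vectors.
    vec = vectorize(doc, idf)

    def similarity(label):
        return sum(v * centroids[label].get(w, 0.0) for w, v in vec.items())

    return max(sorted(centroids), key=similarity)


# Hypothetical toy data: one concatenated text per account.
train_docs = [
    "immigration border security immigration",
    "border security law order",
    "citizens income basic income video",
    "citizens video streaming income",
]
train_labels = ["party_a", "party_a", "party_b", "party_b"]

idf = fit_idf(train_docs)
centroids = train_centroids(train_docs, train_labels, idf)
print(predict("security at the border", idf, centroids))
```

Concatenating all tweets of an account into a single document is itself a design choice; classifying individual tweets and voting per account would be an equally reasonable alternative.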

To evaluate the classifiers' accuracy, we selected a set of current Italian deputies with consistent activity on Twitter as ground truth and predicted their political party. The results of our analysis also gave us interesting insights into current Italian politics.
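The evaluation step can be sketched as comparing each deputy's known party with the predicted one. The example below computes accuracy and a confusion matrix in plain Python; the party labels and predictions are made up for illustration and are not results from the study.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels):
    # Rows are true classes, columns are predicted classes,
    # both in sorted label order.
    counts = Counter(zip(true_labels, predicted_labels))
    labels = sorted(set(true_labels) | set(predicted_labels))
    matrix = [[counts[(t, p)] for p in labels] for t in labels]
    return labels, matrix


def accuracy(true_labels, predicted_labels):
    # Fraction of accounts whose predicted party matches the known one.
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical ground truth and predictions for five accounts.
true_parties = ["party_a", "party_a", "party_b", "party_b", "party_c"]
predicted = ["party_a", "party_b", "party_b", "party_b", "party_c"]

labels, matrix = confusion_matrix(true_parties, predicted)
print(labels)
for row in matrix:
    print(row)
print(accuracy(true_parties, predicted))
```

Off-diagonal cells show which parties get confused with each other, which is often more informative than the single accuracy figure.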

Our study is described in detail in the paper published at the IEEE Big Data 2018 conference:

DOI: 10.1109/BigData.2018.8622040

Published in: Social Media

Content-based Classification of Political Inclinations of Twitter Users

  1. Content-based Classification of Political Inclinations of Twitter Users. Marco Di Giovanni, Marco Brambilla, Stefano Ceri, Florian Daniel, Giorgia Ramponi. 10 December 2018. ABCSS2018 @ IEEE BigData 2018. Politecnico di Milano
  2. Goal: • Classify political inclinations of Twitter users using the content of their tweets • Do users write in a similar way based on their political inclination? • Problem: we don’t have a ground truth! • Select users with known political inclination: politicians
  3. Italian Politics. Chamber of Deputies: 630 total deputies
  4. Twitter
  5. Pipeline: 188 Twitter accounts, 30,643 tweets collected
  6. Best Pipeline. Result: politicians write in similar ways based on their political party
  7. t-SNE: from about 7900 to 2 dimensions. Silhouette score: 0.01
  8. Confusion matrix: how accounts are classified
  9. Relevant words
  10. Relevant words
  11. Topic detection (LSA) • Topic words: Mosque, Immigrant, Party • Topic words: Citizen, Video, Appointment • Common words between topics: Law, Chamber, Minister
  12. Future work • Use more advanced vectorization techniques such as word2vec • Apply to accounts of Italian non-politicians • Integrate with other techniques, such as network analysis
  13. Thank you
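The t-SNE slide reports a silhouette score of 0.01. As a reminder of what that number measures, here is a minimal sketch of the silhouette score in plain Python. It assumes Euclidean distance and uses made-up 2-D points; the slides do not say which distance or which space the reported score was computed in.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels):
    # Mean over all points of (b - a) / max(a, b), where a is the mean
    # distance to the point's own cluster and b is the mean distance to
    # the nearest other cluster.
    clusters = {}
    for i, label in enumerate(labels):
        clusters.setdefault(label, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical points: two tight, well-separated groups.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
well_separated = silhouette_score(points, ["a", "a", "b", "b"])
# Same points, but each label now spans both groups.
mixed = silhouette_score(points, ["a", "b", "a", "b"])
print(well_separated, mixed)
```

Values close to 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest points sit closer to another cluster than to their own.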