Using Topic Modeling to Study Everyday "Civic Talk" and Proto-political Engagements

We present a two-step topic modeling method of analysing political articulations in everyday proto-political "civic talk" on online social media and interpreting them in terms of cultural and political sociology.

Published in: Data & Analytics
  1. 1. Using Topic Modeling to Study Everyday “Civic Talk” and Proto-political Engagements • Veikko Eranti & Tuukka Ylä-Anttila • Universities of Helsinki & Tampere • “Citizens in the Making” (Kone Foundation 2015–2017) • blogs.uta.fi/cim • @VeikkoEranti @TuukkaYlaAnt
  2. 2. Background • A larger project combining ethnographic and digital methods • Citizenship as action, as process – “grown into” • Our subfield: online proto-politics and politics • How does everyday “civic talk” get articulated and raised to the level of political discourse and participation? • Proof-of-concept empirical analysis of a discussion forum dataset
  3. 3. Materials • Project: several social media datasets • Here: Suomi24 (Finland24) forum • Subset of 2.5M words (whole 2001–2015 dataset: 2.5B words) • A general interest forum, largest of its kind in Finland • Sub-forums: local municipalities, cars, hobbies, home & DIY, pets, travel, Jesus, sex, and Jesus & sex • Dedicated sections for political discussion, but it also “leaks” to other discussion areas • We look at proto-political talk on the forum as a whole
  4. 4. Theory • Online political talk is not an ideal Habermasian speech situation or public sphere • Not necessarily political arguments: grievances, expressions of resentment... below the threshold of argumentation and deliberation (Mouffe, Young, Habermas, Laclau, Dahlgren, Thévenot, Klofstad) • Working hypothesis: the articulation of grievances (and the bigger idea of civic culture) is reflected in pre-/proto-political discussions
  5. 5. Methods • Topic modeling: unsupervised machine learning • Takes text, gives you “topics”: sets of words that occur together in documents • Can frames, discourses, justifications and other objects of cultural sociology be operationalized as such topics? (DiMaggio, Nag & Blei 2013) • We run a 50-topic LDA model with MALLET to find (proto)political talk in everyday debates • 50 sets of words which often occur together: topics of discussion
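
To make “takes text, gives you topics” concrete, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling in plain Python. It is not MALLET and not the authors' pipeline: the function names, the hyperparameters `alpha` and `beta`, and the four-message toy corpus are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA. `docs` is a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                       # all tokens in topic k
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample: topics common in this doc and fond of this word win.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


def top_words(topic_word, n=15):
    """The n most frequent words assigned to each topic."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]


# Hypothetical toy corpus; real input would be tokenized forum messages.
docs = [
    "tax money state government economy tax".split(),
    "money tax cut government budget".split(),
    "church religion school language church".split(),
    "language school religion learn church".split(),
]
doc_topic, topic_word = fit_lda(docs, num_topics=2)
for k, words in enumerate(top_words(topic_word, n=5)):
    print(f"topic{k}:", " ".join(words))
```

The word lists it prints play the same role as the top-word lists on the topic slides; which words end up together depends on the random seed and the corpus, so the output still has to be interpreted and validated by reading the data.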
  6. 6. Examples of topics (top 15 words) • topic17: new need Finland through produce change problem build small action future use nowadays opportunity option • topic23: Finland Sweden language church Finnish Swedish speak school country learn Catholic religion belong study Islam • topic32: Finland pay Euro money tax billion state million poor cut government economy rich count large
  7. 7. Interpreting topics • These are political words, but they don’t really represent a political articulation (a position, a justification or even a policy theme) • We interpret 9 of 50 topics as political or proto-political • How to get closer to political articulations from this general “civic talk”? • Let’s pick the “proto-political” topics from the 50 and reduce the dataset to the 100 most important messages from each • Reduced to 827 messages (from ~42 000) • 30-topic LDA model on them
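
The reduction step can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it assumes that “most important” means the highest weight of the topic within a message, and the function name, the weight matrix and the topic indices are hypothetical. Taking the union across topics is one possible reason why 9 topics × 100 messages could yield 827 rather than 900 messages, since a message can rank highly for more than one topic.

```python
def top_messages_per_topic(doc_topic_weights, selected_topics, per_topic=100):
    """Indices of messages that rank in the top `per_topic` for any selected topic.

    doc_topic_weights[d][k] is the (assumed) weight of topic k in message d.
    """
    keep = set()
    for k in selected_topics:
        ranked = sorted(
            range(len(doc_topic_weights)),
            key=lambda d: doc_topic_weights[d][k],
            reverse=True,
        )
        keep.update(ranked[:per_topic])  # a set, so overlapping messages count once
    return sorted(keep)


# Hypothetical weights for 5 messages over 3 topics.
weights = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.60, 0.10, 0.30],
    [0.20, 0.10, 0.70],
    [0.50, 0.00, 0.50],
]
# Treat topics 0 and 2 as the hand-picked "proto-political" ones.
subset = top_messages_per_topic(weights, selected_topics=[0, 2], per_topic=3)
print(subset)  # [0, 2, 3, 4]: messages 2 and 4 rank for both topics
```

The resulting subset is what a second, smaller topic model would then be trained on.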
  8. 8. But first… an aside on VALIDATION of interpretations • This is a proof-of-concept, so we validated these only superficially • In actual work… VALIDATE, VALIDATE, VALIDATE! • Context-specific deep knowledge of your data – read it! • Internal validation, external validation (Evans 2014, Grimmer & Stewart 2013) • ICCSS2015 poster: more systematic validation
  9. 9. Examples of topics in “submodel” • topic3: Marx work workingclass capitalism teacher socialism worker create pay workingtime value long wellbeing production product • topic12: Finland Niinistö parliament Soini president TrueFinn party Halla-aho choose minister leader chairman foreignminister memberofparliament Russia • topic22: member association function union expel organization name right important only Halonen membershipfee forum join DDR
  10. 10. 21 of 30 topics are rather clear political articulations! Example:
  11. 11. Conclusions • 50-topic model of a general interest forum: no, or only vaguely, political articulations • However, the “proto-political” discussions, taken as a reduced dataset, produce much more coherent articulations • Locating proto-political talk in big data and then, further, pinpointing political articulations arising from that • Drawing a map of big datasets for further qualitative exploration • Sub-model topics are still largely thematic rather than practices, frames, justifications etc. • Can we get at these through vocabulary? • Note: this demo used 1/1000 of the entire Suomi24 dataset • Importance of theory and conceptual work
  12. 12. Extra idea • Could we model 1) fringe forums, 2) “mid-level” forums and 3) general forums/media to plot the emergence, spreading and mainstreaming of articulations?
  13. 13. References • Dahlgren, Peter. 2000. “The Internet and the Democratization of Civic Culture.” Political Communication 17: 335–40. • DiMaggio, Paul, Manish Nag, and David M. Blei. 2013. “Exploiting Affinities between Topic Modeling and the Sociological Perspective on Culture: Application to Newspaper Coverage of U.S. Government Arts Funding.” Poetics 41(6): 570–606. • Evans, Michael S. 2014. “A Computational Approach to Qualitative Analysis in Large Textual Datasets.” PLoS ONE 9(2): 1–10. • Grimmer, Justin, and Brandon M. Stewart. 2013. “Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts.” Political Analysis 21(3): 267–97. • Klofstad, Casey A. 2011. Civic Talk: Peers, Politics, and the Future of Democracy. Temple University Press. • Thévenot, Laurent. 2014. “Voicing Concern and Difference: From Public Spaces to Common-Places.” European Journal of Cultural and Political Sociology 1(1): 7–34. • Etc.
