News Sharing User Behaviour on Twitter: a Comprehensive Data Collection of News Articles and Social Interactions

We offer free and open access to two core resources for social media and journalism research:

1) A data collection and enrichment pipeline that generates custom datasets including both news content and the social interactions around it, starting from a predefined set of news sources.

2) A dataset that includes news articles from major U.S. news outlets and the associated sharing activity on Twitter, covering tweet content and author profiles.

First, we build a robust pipeline for collecting datasets that describe news sharing. The pipeline takes a list of news sources as input and generates a large collection of articles, of the accounts that share them on social media (either directly or by retweeting), and of the social activities performed by these accounts.
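
The input/output contract described above can be pictured with a minimal Python sketch. It is not the NewsAnalyzer code: the class names, field names, and the three `fetch_*` callables are hypothetical stand-ins for whatever crawler and Twitter client a real pipeline would use, and the demo data is invented.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Article:
    url: str
    source: str
    title: str


@dataclass(frozen=True)
class Share:
    tweet_id: str
    user_id: str
    article_url: str
    is_retweet: bool  # False: the account posted the link directly


@dataclass
class Collection:
    articles: List[Article] = field(default_factory=list)
    shares: List[Share] = field(default_factory=list)
    # Other activity of each sharing account, keyed by user id.
    activities: Dict[str, List[str]] = field(default_factory=dict)


def collect(
    sources: Iterable[str],
    fetch_articles: Callable[[str], Iterable[Article]],
    fetch_shares: Callable[[Article], Iterable[Share]],
    fetch_activity: Callable[[str], List[str]],
) -> Collection:
    """Walk sources -> articles -> shares -> sharing accounts' activity."""
    out = Collection()
    for source in sources:
        for article in fetch_articles(source):
            out.articles.append(article)
            for share in fetch_shares(article):
                out.shares.append(share)
                # Fetch each account's activity only once.
                if share.user_id not in out.activities:
                    out.activities[share.user_id] = fetch_activity(share.user_id)
    return out


# Invented stubs standing in for real crawlers and API clients.
def demo_articles(source: str) -> List[Article]:
    return [Article(f"https://{source}/a1", source, "Example headline")]


def demo_shares(article: Article) -> List[Share]:
    return [
        Share("t1", "u1", article.url, is_retweet=False),
        Share("t2", "u2", article.url, is_retweet=True),
    ]


def demo_activity(user_id: str) -> List[str]:
    return [f"tweet by {user_id}"]


demo = collect(["example.com"], demo_articles, demo_shares, demo_activity)
```

Passing the fetchers in as callables keeps the traversal logic testable without network access; a real run would swap the stubs for actual clients.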

The dataset is published on Harvard Dataverse:

https://doi.org/10.7910/DVN/5XRZLH

Second, we provide a large-scale dataset that can be used to study the social behavior of Twitter users and their involvement in the dissemination of news items. Finally, we show an application of our data collection to political stance classification and suggest other potential uses of the presented resources.

The code is published on GitHub:

https://github.com/DataSciencePolimi/NewsAnalyzer

The details of our approach are published in a paper at ICWSM 2019, accessible online in the AAAI library:
https://aaai.org/ojs/index.php/ICWSM/article/view/3256

You can cite the paper as:

Giovanni Brena, Marco Brambilla, Stefano Ceri, Marco Di Giovanni, Francesco Pierri, Giorgia Ramponi. News Sharing User Behaviour on Twitter: A Comprehensive Data Collection of News Articles and Social Interactions. AAAI ICWSM 2019, pp. 592-597.

Published in: Data & Analytics

  1. News Sharing User Behaviour on Twitter: a Comprehensive Data Collection of News Articles and Social Interactions
     Giovanni Brena, Marco Brambilla, Stefano Ceri, Marco Di Giovanni, Francesco Pierri, Giorgia Ramponi
     Politecnico di Milano
     @marcobrambi, marco.brambilla@polimi.it, http://datascience.deib.polimi.it/
  2. The Dataset
     • 69 news sources (online newspapers and news agencies)
     • 8 news categories
     • 300K news articles
     • 1 million tweets sharing the news
     • 40K user profiles authoring the tweets
     • On dataverse.harvard.edu: https://doi.org/10.7910/DVN/5XRZLH
  3. The Dataset
     • Article details: source, category, topic
  4. The Dataset
     • Enriched user profile: age, sex, ethnicity, geo-localization and Twitter profile
     • Twitter network of the user: followers, friends, mentions, and retweets
  5. The Dataset
     • Sharing behaviour analysis: sharing and reaction stats, vocabulary and most frequent hashtags
  6. The Data Collection & Enrichment Pipeline
     Implementation available on GitHub: https://github.com/DataSciencePolimi/NewsAnalyzer
  7. What do I use it for? … Anything you want
     • Prediction of U.S. 2018 Midterm election results by state, based on classification of news sharing behaviour
     • Rep. vs. Dem. affiliation classified with 90% accuracy at user level
     • Results wrongly predicted for only 5 states
  8. News Sharing User Behaviour on Twitter: a Comprehensive Data Collection of News Articles and Social Interactions
     Marco Brambilla, Politecnico di Milano, Italy
     @marcobrambi, marco.brambilla@polimi.it, http://datascience.deib.polimi.it/
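
Slide 7 goes from user-level Rep. vs. Dem. classification to state-level predictions, but the slides do not say how user labels are combined per state. As a rough illustration only, the sketch below assumes a simple majority vote over the classified users in each state; the function name, the label strings, and the sample data are all invented and are not taken from the paper or the dataset.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def predict_states(user_labels: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each state to the most common predicted label among its users.

    user_labels yields (state, label) pairs, one per classified user.
    Majority voting is an assumption of this sketch. On a tie, the label
    seen first for that state wins.
    """
    votes: Dict[str, Counter] = defaultdict(Counter)
    for state, label in user_labels:
        votes[state][label] += 1
    return {state: counts.most_common(1)[0][0] for state, counts in votes.items()}


# Invented sample: three classified users in TX, one in CA.
sample = [("TX", "Rep"), ("TX", "Rep"), ("TX", "Dem"), ("CA", "Dem")]
state_prediction = predict_states(sample)
```

A weighted vote (for example by classifier confidence or by user activity) would be an equally plausible alternative; the majority rule is chosen here only because it is the simplest aggregation to read.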
