
Fail ir16 intro



Introduction for the second workshop "#FAIL! Things that didn't work out in social media research - and what we can learn from them". Workshop at #ir16 conference, Phoenix, October 21st, 2015

Published in: Social Media


  1. 1. #FAIL! THINGS THAT DIDN’T WORK OUT IN SOCIAL MEDIA RESEARCH - AND WHAT WE CAN LEARN FROM THEM Workshop at Internet Research 16, Phoenix, October 21st, 2015.
  2. 2. • Workshop hashtag: #fail2015b • Conference hashtag: #ir16 • Workshop website: • Etherpad: WELCOME Luca Rossi @LR Karine Nahon @karineb Katrin Weller @kwelle
  3. 3. ABOUT #FAIL! WORKSHOPS • Traveling on to different conferences. The first workshop was at WebSci15 (June 2015) • Aim: collect various examples of things that can go wrong and share them with different communities • Learn from experiences • Connect different research communities
  5. 5. [Chart: number of publications per year (2001-2013) that mention the respective social media platform’s name in their title, for Twitter, Facebook, YouTube, Blogs, Wikis, Foursquare, LinkedIn and MySpace; y-axis from 0 to 600. Source: Scopus title search.] For details: SOCIAL MEDIA RESEARCH
  6. 6. 2008-2013 papers on Twitter and elections: data sources. Weller, K. (2014). Twitter und Wahlen: Zwischen 140 Zeichen und Milliarden von Tweets. In: R. Reichert (Ed.), Big Data: Analysen zum digitalen Wandel von Wissen, Macht und Ökonomie (pp. 239-257). Bielefeld: transcript. SOCIAL MEDIA RESEARCH
     Data source: number
     No information: 11
     Collected manually from Twitter website (Copy-Paste / Screenshot): 6
     Twitter API (no further information): 8
     Twitter Search API: 3
     Twitter Streaming API: 1
     Twitter Rest API: 1
     Twitter API user timeline: 1
     Own program for accessing Twitter APIs: 4
     Twitter Gardenhose: 1
     Official Reseller (Gnip, DataSift): 3
     YourTwapperKeeper: 3
     Other tools (e.g. Topsy): 6
     Received from colleagues: 1
  7. 7. What we discussed at the first workshop… CURRENT PROBLEMS
  8. 8. Challenge 1: users • How to involve social media users in the research process? • Presentation by Elodie Crespel: “Extending data collection with web browser extension” – Participants may be creative in their use of technology – flexibility is needed.
  9. 9. Challenge 2: methods • Data analytics: which approaches should be chosen? • Taha Yasseri: “The double-edged sword of statistical significance” – Questions the p-value as a standard for data analytics. – “too much of attention and reliance on specific measures or methods without being aware of the logic behind them, can be misleading”
  10. 10. Challenge 3: tools • Many researchers use third-party tools for data collection or analysis – which may not always work as expected. • Presentation by Michael Bossetta and Anamaria Dutceac Segesten: “Tracing Eurosceptic Party Networks via Hyperlink Network Analysis and #FAIL!ng: Can Web Crawlers Keep up with Web Design?” – Exemplary case: issuecrawler.
  11. 11. Challenge 4: content • Content analysis is heavily affected by the dynamic nature of social media. • Presentation by Marie Van Cranenbroeck: “Managing and Using Unstable Data in a Social Science Research about Museums and Audiences on Social Media” – Data collection and storage challenges
  12. 12. Specific details and additions • Researchers and users may have different ideas about the definition of social media / social networks • Lack of evaluation standards • Availability of data (also: not enough data) • Data may be corrupt (e.g. missing data) • Social media as a moving target (Karpf, D. (2012). Social science research methods in Internet time. Information, Communication & Society 15(5):639-661.)
  13. 13. Meta discussion • Social media research can have various forms. Different disciplines involved. • Best practices and pitfalls in social media research are mainly discussed informally. Few possibilities to share unsuccessful approaches.
  14. 14. WHAT WE’D LIKE TO LEARN TODAY • Towards a categorization of challenges for social media research: what can go wrong? • Collection of more experiences • Structuring them into different categories
  15. 15. WHAT WE’D LIKE TO LEARN TODAY Today: - 4 presentations - Think about your own experiences! - … in connection to each presentation - … in general
  16. 16. • 9:00 Introduction: “What we’ve learnt from the first workshop and what we’d like to learn today” • 9:15 Shawn Walker: “Complexity of collecting social media data in ephemeral contexts” • 9:40 Cornelius Puschmann: “Why LIWC sucks (or: saner options for social media content analysis)” • 10:05 Break • 10:20 Luca Rossi: “The fourth deadly sin of social media researchers (or: scientific research and unstable socio-technical platforms)” • 10:45 Marco Toledo Bastos: “Individual Behavior from Aggregate Social Media Data” • 11:10 Discussion & Conclusions • 12:00 End PROGRAM
  17. 17. • Other experiences? Share your thoughts! • Main categories of #fail cases? • Top 3 takeaway messages for next workshop? DISCUSSION
  18. 18. WHERE TO GO FROM HERE? • Next steps – lessons learnt for future workshop organisation • Which additional conferences? • Publication? Guidebook?
  19. 19. • Archiving: – URLs may vanish (Question: linear rate of decay?) – Images missing – Platforms changing (moving target!) – not just about the interface! • Visualization of results – Word cloud (compare histograms) • Tools – sentiment140, Internet Archive, Gnip • Methods – Content Analysis: • Replicability? Validation? • Context for social media contents (e.g. surrounding tweets). • LIWC, General Inquirer – Predictions – “Data Science” • Lack of theory • Data Quality: – Can we still cite/use data and research published in 2007/2008? – Baseline? (how to define for a moving target) • Theory: – Can we only do descriptive work for single platforms? – Look for the theory instead of the data? • Meta – Systematic review of existing literature is needed • Documentation – Timeframe generalization – Document time, cultures? – How long will my results be valid? – Have a general base for comparison
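
The archiving note above asks whether URLs vanish at a linear rate. As a rough illustration of why the answer matters, the sketch below compares two simple models: a linear one, where a fixed share of the original URLs is lost each year, and an exponential one, where a fixed share of the remaining URLs is lost each year. Both function names and the 10% yearly loss are assumptions made up for this example; they are not figures from the workshop or from any measurement.

```python
def linear_survival(years, yearly_loss):
    """Share of URLs still reachable if a fixed share of the ORIGINAL set is lost each year."""
    return max(0.0, 1.0 - yearly_loss * years)


def exponential_survival(years, yearly_loss):
    """Share of URLs still reachable if a fixed share of the REMAINING set is lost each year."""
    return (1.0 - yearly_loss) ** years


# Assumed for illustration only; not a measured decay rate.
YEARLY_LOSS = 0.10

if __name__ == "__main__":
    for years in (1, 5, 10):
        linear = linear_survival(years, YEARLY_LOSS)
        exponential = exponential_survival(years, YEARLY_LOSS)
        print(f"after {years:2d} years: linear {linear:.2f}, exponential {exponential:.2f}")
```

Under these assumptions the two models agree after one year but diverge afterwards: the linear model reaches zero surviving URLs after ten years, while the exponential model still keeps roughly a third. Which shape fits real data would have to be checked against an actual archive sample.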