Computational Social Science and microposts - The good, the bad and the ugly

#Microposts2014 keynote at WWW2014, Seoul, Korea

Transcript

  • 1. Markus Strohmaier, GESIS – Leibniz Institute for the Social Sciences & U. of Koblenz. Computational Social Science and microposts - The good, the bad and the ugly. #Microposts2014 at WWW’2014, Seoul, Korea. www.markusstrohmaier.info
  • 2. Found Social Data: erosion and accretion. Everybody lies to her doctor, and the Hawthorne effect. Reference: Webb, E. J., et al. (1966). Unobtrusive measures: Nonreactive research in the social sciences.
  • 3. Computational Social Science: “The science that investigates social phenomena through the medium of computing and algorithmic data processing.” [adapted from CSSSA: http://computationalsocialscience.org/] • Harvard iQS • Stanford IRiSS • CMU CASOS • ESRC COSMOS • Web Observatories • …
  • 4. Computational Social Science: Example. Stanley Milgram (1967) • Social Scientist • Theory: A small world • 6 degrees of separation. Jure Leskovec (2008) • Computer Scientist • (Found) Data: 240 million users • 7 degrees of separation
  • 5. What social media platforms are we focusing on? [Figure labels: Blogs, Microposts] Weller, K. (2014). Bibliometric analysis of social media research: Publication output for different social media platforms. Blog post, 07.04.2014. Retrieved from: http://kwelle.wordpress.com/2014/04/07/bibliometric-analysis-of-social-media-research/
  • 6. What disciplines are interested in microposts? Zimmer, Michael / Proferes, Nicholas. (In press). "A topology of Twitter research: Disciplines, methods, and ethics". To appear in Aslib Journal of Information Management 66(3), special issue on Twitter data analysis.
  • 7. THE GOOD: Computational Social Science and Microposts
  • 8. Observing political movements and conversations: analysis of political conversations on Twitter (Egyptian revolution 2011), based on about 100 million tweets on Egypt. [Figure annotations: ~ first day of the protests; days on which the internet was shut down] Markus Strohmaier, work at (Xerox) PARC in collaboration with Lichan Hong (then at PARC, now at Google).
  • 9. Assessing online conversational practices of political parties. Lada Adamic and Natalie Glance. The political blogosphere and the 2004 US election: divided they blog. Proceedings of the 3rd international workshop on Link discovery. ACM, 2005.
  • 10. Assessing online conversational practices of political parties on Twitter during the German National Election 2013. Haiko Lietz, Claudia Wagner, Arnim Bleier, and Markus Strohmaier, When Politicians Talk: Assessing Online Conversational Practices of Political Parties on Twitter, The International AAAI Conference on Weblogs and Social Media (ICWSM2014), Ann Arbor, MI, US, 2014.
  • 11. More examples. Predicting… • personality from Twitter (Golbeck et al. 2011) • depression via social media (De Choudhury et al. 2013) • elections with Twitter (Tumasjan et al. 2010) • crime using Twitter (Gerber 2014) • stock market indicators (Zhang et al. 2011) • flu trends using Twitter data (Achrekar et al. 2011) • box-office revenues (Asur & Huberman 2010)
  • 12. THE BAD: Computational Social Science and Microposts
  • 13. You cannot predict elections with Twitter
    Party     Election results (%)   Tweets (%)
    CDU       28.4                   18.6
    CSU        6.8                    3.0
    SPD       24.0                   14.7
    FDP       15.2                   11.2
    Linke     12.4                    8.3
    Grüne     11.1                    9.3
    Piraten    2.1                   34.8
    References: Jungherr, A., Jürgens, P., and Schoen, H. 2011. Why the Pirate Party Won the German Election of 2009 or The Trouble With Predictions: A Response to Tumasjan, A., Sprenger, T. O., Sander, P. G., & Welpe, I. M. “Predicting Elections With Twitter: What 140 Characters Reveal About Political Sentiment”. In Social Science Computer Review. Daniel Gayo-Avello: No, You Cannot Predict Elections with Twitter. IEEE Internet Computing 16(6): 91-94 (2012).
  • 14. You cannot identify users' perceived expertise based on tweets. What is Peer's expertise? It's not in his tweets. [Figure label: # topics] Claudia Wagner, Vera Liao, Peter Pirolli, Les Nelson and Markus Strohmaier, It's not in their tweets: Modeling topical expertise of Twitter users, ASE/IEEE International Conference on Social Computing (SocialCom2012), Amsterdam, The Netherlands, 2012.
  • 15. You cannot assume users behave consistently over time: during the German National Election 2013. Haiko Lietz, Claudia Wagner, Arnim Bleier, and Markus Strohmaier, When Politicians Talk: Assessing Online Conversational Practices of Political Parties on Twitter, The International AAAI Conference on Weblogs and Social Media (ICWSM2014), Ann Arbor, MI, US, 2014.
  • 16. You cannot assume users behave consistently over time. [Figure labels: votes; The Frontpage of the Internet; A self-referential community; Images, Video, Text, self.reddit] P. Singer, F. Flöck, C. Meinhart, E. Zeitfogel and M. Strohmaier. Evolution of Reddit: From the Front Page of the Internet to a self-referential community? In 23rd International World Wide Web Conference (WWW2014), Web-Science Track, Seoul, Korea, 2014.
  • 17. THE UGLY: Computational Social Science and Microposts. What are the reasons for some of the "bad"?
  • 18. Demographic Biases. Mislove, Alan, et al. "Understanding the Demographics of Twitter Users." ICWSM (2011).
  • 19. Social Bots. C. Wagner, S. Mitter, C. Körner and M. Strohmaier. When social bots attack: Modeling susceptibility of users in online social networks. In Proceedings of the 2nd Workshop on Making Sense of Microposts (MSM'2012), held in conjunction with the 21st World Wide Web Conference (WWW'2012), Lyon, France, 2012.
  • 20. Impact of social bot attacks. S. Mitter, C. Wagner, and M. Strohmaier. Understanding the impact of socialbot attacks in online social networks. In ACM Web Science 2013, May 2-4, Paris, France, 2013.
  • 21. URL Shortener spam (data from qr.cx). F. Klien and M. Strohmaier. Short links under attack: Geographical analysis of spam in a url shortener network. In Proceedings of the 23rd Conference on Hypertext and Social Media (HT2012). ACM, 2012.
  • 22. URL Shortener spam. Video: https://www.youtube.com/watch?v=06Mhn0L23Tk F. Klien and M. Strohmaier. Short links under attack: Geographical analysis of spam in a url shortener network. In Proceedings of the 23rd Conference on Hypertext and Social Media (HT2012). ACM, 2012.
  • 23. Context Collapse. http://speakingofresearch.com/2014/02/27/fact-into-fiction-why-context-matters-with-animal-images/
  • 24. Invisible Engagement. Zeynep Tufekci, Big Questions for Social Media Big Data: Representativeness, Validity and Other Methodological Pitfalls, The International AAAI Conference on Weblogs and Social Media (ICWSM2014), Ann Arbor, MI, US, 2014.
  • 25. Challenges, after Huberty (2014) • N != all: we have both N < all and N > all • All (today) != All (tomorrow): user populations change • Online behavior != Offline behavior: multi-faceted identities • Behavior of all (today) != Behavior of all (tomorrow): behavior changes and evolves. Mark Huberty (2014). I expected a Model T, but instead I got a loom: Awaiting the second big data revolution.
  • 26. What makes matters even worse… [Figure: image equation ending in "What about: ? …?"] Zeynep Tufekci, Big Questions for Social Media Big Data: Representativeness, Validity and Other Methodological Pitfalls, The International AAAI Conference on Weblogs and Social Media (ICWSM2014), Ann Arbor, MI, US, 2014.
  • 27. Opportunities. Sometimes, though: • N = all (or almost all) • All (today) is similar enough to All (tomorrow) • Online behavior approximates offline behavior • Behavior of all (today) predicts Behavior of all (tomorrow)
  • 28. Where do we go from here? • An opportunity for computational social science – Hypothesis exploration & validation – Triangulation (panels, statistics about societies, etc.) – Open industrial and governmental data – Anonymization, reproducibility and data archiving – Living labs and mass experimentation – …?
  • 29. Thank you! Markus Strohmaier
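
As a rough illustration of the comparison on slide 13, the sketch below measures how far tweet shares sit from election results. The numbers are copied from that slide; the mean absolute error metric and the names `SHARES`, `mean_absolute_error` and `leader` are assumptions made for this example, not something the talk prescribes.

```python
# Shares in percent as listed on slide 13: (election result, share of tweets).
SHARES = {
    "CDU": (28.4, 18.6),
    "CSU": (6.8, 3.0),
    "SPD": (24.0, 14.7),
    "FDP": (15.2, 11.2),
    "Linke": (12.4, 8.3),
    "Grüne": (11.1, 9.3),
    "Piraten": (2.1, 34.8),
}


def mean_absolute_error(shares):
    """Average absolute gap between vote share and tweet share, in percentage points."""
    gaps = [abs(votes - tweets) for votes, tweets in shares.values()]
    return sum(gaps) / len(gaps)


def leader(shares, column):
    """Party with the largest value in the given column (0 = votes, 1 = tweets)."""
    return max(shares, key=lambda party: shares[party][column])


if __name__ == "__main__":
    print(f"MAE: {mean_absolute_error(SHARES):.2f} percentage points")
    print("Leader by votes:", leader(SHARES, 0))
    print("Leader by tweets:", leader(SHARES, 1))
```

With these figures the party leading in tweet share is not the party leading in votes, which is the mismatch the slide title points to.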