The Pleasures and Perils of Studying
Social Media
Dr. Katrin Weller
GESIS – Leibniz-Institute for the Social Sciences
Dept. of Computational Social Science
Cologne, Germany
E-Mail: katrin.weller@gesis.org ●Twitter: @kwelle ● Web: www.katrinweller.net
Slides are available at: http://de.slideshare.net/katrinweller
2
SERIOUSLY? DO THEY NOT REALIZE THAT 99%
OF TWEETS ARE WORTHLESS BABBLE THAT
READ SOMETHING LIKE ‘JUST WOKE UP. GOING
TO STARBUCKS NOW. GETTING LATTE.’
READER’SCOMMENTFOUNDINTHECOMMENTSECTIONFORGROSS,D.(2010,APRIL14).LIBRARYOFCONGRESSTOARCHIVEYOURTWEETS.CNN.RETRIEVEDFROMHTTP://EDITION.CNN.COM/2010/TECH/04/14/LIBRARY.CONGRESS.TWITTER/,
RETRIEVEDNOVEMBER19.
PHOTOS:HTTPS://WWW.FLICKR.COM/SEARCH/?TEXT=COFFEE&LICENSE=4%2C5%2C6%2C9%2C10
Social media research output is growing
0
500
1000
1500
2000
2500
3000
3500
4000
4500
5000
No. of publications (Scopus)
(TITLE-ABS-KEY("social media") OR TITLE-ABS-KEY("social web") OR TITLE-ABS-KEY("social software") OR TITLE-ABS-KEY("web 2.0")) AND PUBYEAR > 1999
Pleasures in Studying Social Media
4
#1: New type of data
• Researchers value social media as a new type of data
• Previously „ephemeral data“ become visible
• Immediate – quick reaction to events
• Structured
• „natural“ data
5
“What I find really interesting is that structure becomes manifest in
internet communication. So it’s the first time in history actually that
we can, that social structures between people become manifest
within a technology. (...) They become visible, they become
crawlable, they become analyzable.”
Kinder-Kurlanda, Katharina E., and Katrin Weller. 2014. "'I always feel it must be great to be a hacker!': The role of interdisciplinary work
in social media research." In Proceedings of the 2014 ACM conference on Web Science, 91-98. New York: ACM.
Social Media Data
• Texts
• Images
• Videos
• Mixed formats / Multimedia
• Connections I (friends, followers)
• Connections II (links/URLs)
• Connections/Actions (likes, favs, comments, downloads)
#2: Various research topics
• User groups
• Events
• Audiences
• Practices
• Information flow
• Influence
• Opinions and sentiments
• Networks
• Interactions
• Predictions
• Language
• Culture
• Political communication
• Activism
• Crisis
communication/disaster
response
• E-learning
• Health
• Brand communication
#3: Multi-disciplinary environment
• Freedom to explore new approaches
• Multi-method
• Exchange with other disciplines
8
Scopus: 2000-today by subject area
10650; 36%
5542; 19%
2384
2288
2151
1535
773
772
65 Computer Science
Social Sciences
Engineering
Medicine
Business, Management and Accounting
Mathematics
Arts and Humanities
Decision Sciences
Psychology
Nursing
Economics, Econometrics and Finance
Biochemistry, Genetics and Molecular Biology
Health Professions
Environmental Science
Earth and Planetary Sciences
Agricultural and Biological Sciences
Pharmacology, Toxicology and Pharmaceutics
Physics and Astronomy
Materials Science
Multidisciplinary
Neuroscience
Immunology and Microbiology
Chemical Engineering
Veterinary
Dentistry
Chemistry
Energy
Perils? – Challenges!
10
#1: Model organisms in social media
research?
11
https://en.wikipedia.org/wiki/Model_organism#/media/File:Drosophil
a_melanogaster_-_side_(aka).jpg
Tufekci, Z. 2014. Big Questions for Social Media Big Data: Representativeness, Validity and Other Methodological Pitfalls.
In Proceedings of the Eighth International AAAI Conference on Weblogs and Social Media (ICWSM).
Social Media Research
0
100
200
300
400
500
600
2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013
Twitter
Facebook
YouTube
Blogs
Wikis
Foursquare
LinkedIn
MySpace
Number of publications per year, which mention the respective social media platform‘s name in their title. Scopus
Title Search. See:
Reasons?
• Accessibility
• Legal framework
• Ethical framework
13
#2: Comparability?
• Longitudinal studies
• Cross-plattform studies
• Comparisions across countries, populations, topics
• Lack of standards (methods, metadata, benchmark
datasets)
14
15
Different methods and types of datasets, examples from popular social science papers
Weller, K. (2014). What do we get from Twitter – and what not? A close look at Twitter research in the social sciences.
Knowledge Organization. 41(3), 238-248
Example 2008-2013 papers on Twitter and elections:
data sources
Weller, K. (2014). Twitter und Wahlen: Zwischen 140 Zeichen und Milliarden von Tweets. In: R. Reichert (Ed.), Big
Data: Analysen zum digitalen Wandel von Wissen, Macht und Ökonomie (pp. 239-257). Bielefeld: transcript.
16
Data source number
No information 11
Collected manually from Twitter website (Copy-Paste /
Screenshot)
6
Twitter API (no further information) 8
Twitter Search API 3
Twitter Streaming API 1
Twitter Rest API 1
Twitter API user timeline 1
Own program for accessing Twitter APIs 4
Twitter Gardenhose 1
Official Reseller (Gnip, DataSift) 3
YourTwapperKeeper 3
Other tools (e.g. Topsy) 6
Received from colleagues 1
# 3: Data Access
17
“But you can’t make your data available for others to look at, which means both
your study can’t really be replicated and it can’t be tested for review. But also it
just means your data can’t be made available for other people to say, Ah you
have done this with it, I’ll see what I can do with it, (…) There is no open data.”
Weller, Katrin, and Katharina E. Kinder-Kurlanda. 2015. "Uncovering the Challenges in Collection, Sharing and Documentation: The Hidden
Data of Social Media Research?." In Standards and Practices in Large-Scale Social Media Research: Papers from the 2015 ICWSM
Workshop. Proceedings Ninth International AAAI Conference on Web and Social Media Oxford University, May 26, 2015 – May 29, 2015,
28-37. Ann Arbor, MI: AAAI Press.
Available datasets
• From individual researchers/groups (sometimes
„black market“).
• From conferences: e.g. ICWSM
• Archival institutions: e.g. GESIS (doi:10.4232/1.12319)
18
http://www.politico.com/story/2015/07/library-of-
congress-twitter-archive-119698.html
https://www.insidehighered.com/views/2015/0
6/03/article-difficulties-social-media-research
Unvailable datasets
#5 Moving target
• Platforms and users change
• „Forgotton Features“
– Favs on Twitter
– Hashtags on Facebook
– …
20
Lost context: interfaces, look and feel
21
https://www.flickr.com/photos/jackdorsey/182613360
Lost context: interfaces, look and feel
22Facebook on Dec. 25, 2005. Via Internet Archive‘s Wayback Machine
Lost context: stories and meaning
#jan25
23
#6 Data loss
• Deleted tweets
24
25
26
27
Format supported by
Twitter Terms of services
#7: Ethics and Privacy
28
2005 2011
29
EXCITING
Pleasures or Perils?
Questions and Feedback
30
katrin.weller@gesis.org
@kwelle
http://katrinweller.net

Weller pleasures+perils social media

  • 1.
    The Pleasures andPerils of Studying Social Media Dr. Katrin Weller GESIS – Leibniz-Institute for the Social Sciences Dept. of Computational Social Science Cologne, Germany E-Mail: katrin.weller@gesis.org ●Twitter: @kwelle ● Web: www.katrinweller.net Slides are available at: http://de.slideshare.net/katrinweller
  • 2.
    2 SERIOUSLY? DO THEYNOT REALIZE THAT 99% OF TWEETS ARE WORTHLESS BABBLE THAT READ SOMETHING LIKE ‘JUST WOKE UP. GOING TO STARBUCKS NOW. GETTING LATTE.’ READER’SCOMMENTFOUNDINTHECOMMENTSECTIONFORGROSS,D.(2010,APRIL14).LIBRARYOFCONGRESSTOARCHIVEYOURTWEETS.CNN.RETRIEVEDFROMHTTP://EDITION.CNN.COM/2010/TECH/04/14/LIBRARY.CONGRESS.TWITTER/, RETRIEVEDNOVEMBER19. PHOTOS:HTTPS://WWW.FLICKR.COM/SEARCH/?TEXT=COFFEE&LICENSE=4%2C5%2C6%2C9%2C10
  • 3.
    Social media researchoutput is growing 0 500 1000 1500 2000 2500 3000 3500 4000 4500 5000 No. of publications (Scopus) (TITLE-ABS-KEY("social media") OR TITLE-ABS-KEY("social web") OR TITLE-ABS-KEY("social software") OR TITLE-ABS-KEY("web 2.0")) AND PUBYEAR > 1999
  • 4.
    Pleasures in StudyingSocial Media 4
  • 5.
    #1: New typeof data • Researchers value social media as a new type of data • Previously „ephemeral data“ become visible • Immediate – quick reaction to events • Structured • „natural“ data 5 “What I find really interesting is that structure becomes manifest in internet communication. So it’s the first time in history actually that we can, that social structures between people become manifest within a technology. (...) They become visible, they become crawlable, they become analyzable.” Kinder-Kurlanda, Katharina E., and Katrin Weller. 2014. "'I always feel it must be great to be a hacker!': The role of interdisciplinary work in social media research." In Proceedings of the 2014 ACM conference on Web Science, 91-98. New York: ACM.
  • 6.
    Social Media Data •Texts • Images • Videos • Mixed formats / Multimedia • Connections I (friends, followers) • Connections II (links/URLs) • Connections/Actions (likes, favs, comments, downloads)
  • 7.
    #2: Various researchtopics • User groups • Events • Audiences • Practices • Information flow • Influence • Opinions and sentiments • Networks • Interactions • Predictions • Language • Culture • Political communication • Activism • Crisis communication/disaster response • E-learning • Health • Brand communication
  • 8.
    #3: Multi-disciplinary environment •Freedom to explore new approaches • Multi-method • Exchange with other disciplines 8
  • 9.
    Scopus: 2000-today bysubject area 10650; 36% 5542; 19% 2384 2288 2151 1535 773 772 65 Computer Science Social Sciences Engineering Medicine Business, Management and Accounting Mathematics Arts and Humanities Decision Sciences Psychology Nursing Economics, Econometrics and Finance Biochemistry, Genetics and Molecular Biology Health Professions Environmental Science Earth and Planetary Sciences Agricultural and Biological Sciences Pharmacology, Toxicology and Pharmaceutics Physics and Astronomy Materials Science Multidisciplinary Neuroscience Immunology and Microbiology Chemical Engineering Veterinary Dentistry Chemistry Energy
  • 10.
  • 11.
    #1: Model organismsin social media research? 11 https://en.wikipedia.org/wiki/Model_organism#/media/File:Drosophil a_melanogaster_-_side_(aka).jpg Tufekci, Z. 2014. Big Questions for Social Media Big Data: Representativeness, Validity and Other Methodological Pitfalls. In Proceedings of the Eighth International AAAI Conference on Weblogs and Social Media (ICWSM).
  • 12.
    Social Media Research 0 100 200 300 400 500 600 20012002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 Twitter Facebook YouTube Blogs Wikis Foursquare LinkedIn MySpace Number of publications per year, which mention the respective social media platform‘s name in their title. Scopus Title Search. See:
  • 13.
    Reasons? • Accessibility • Legalframework • Ethical framework 13
  • 14.
    #2: Comparability? • Longitudinalstudies • Cross-plattform studies • Comparisions across countries, populations, topics • Lack of standards (methods, metadata, benchmark datasets) 14
  • 15.
    15 Different methods andtypes of datasets, examples from popular social science papers Weller, K. (2014). What do we get from Twitter – and what not? A close look at Twitter research in the social sciences. Knowledge Organization. 41(3), 238-248
  • 16.
    Example 2008-2013 paperson Twitter and elections: data sources Weller, K. (2014). Twitter und Wahlen: Zwischen 140 Zeichen und Milliarden von Tweets. In: R. Reichert (Ed.), Big Data: Analysen zum digitalen Wandel von Wissen, Macht und Ökonomie (pp. 239-257). Bielefeld: transcript. 16 Data source number No information 11 Collected manually from Twitter website (Copy-Paste / Screenshot) 6 Twitter API (no further information) 8 Twitter Search API 3 Twitter Streaming API 1 Twitter Rest API 1 Twitter API user timeline 1 Own program for accessing Twitter APIs 4 Twitter Gardenhose 1 Official Reseller (Gnip, DataSift) 3 YourTwapperKeeper 3 Other tools (e.g. Topsy) 6 Received from colleagues 1
  • 17.
    # 3: DataAccess 17 “But you can’t make your data available for others to look at, which means both your study can’t really be replicated and it can’t be tested for review. But also it just means your data can’t be made available for other people to say, Ah you have done this with it, I’ll see what I can do with it, (…) There is no open data.” Weller, Katrin, and Katharina E. Kinder-Kurlanda. 2015. "Uncovering the Challenges in Collection, Sharing and Documentation: The Hidden Data of Social Media Research?." In Standards and Practices in Large-Scale Social Media Research: Papers from the 2015 ICWSM Workshop. Proceedings Ninth International AAAI Conference on Web and Social Media Oxford University, May 26, 2015 – May 29, 2015, 28-37. Ann Arbor, MI: AAAI Press.
  • 18.
    Available datasets • Fromindividual researchers/groups (sometimes „black market“). • From conferences: e.g. ICWSM • Archival institutions: e.g. GESIS (doi:10.4232/1.12319) 18
  • 19.
  • 20.
    #5 Moving target •Platforms and users change • „Forgotton Features“ – Favs on Twitter – Hashtags on Facebook – … 20
  • 21.
    Lost context: interfaces,look and feel 21 https://www.flickr.com/photos/jackdorsey/182613360
  • 22.
    Lost context: interfaces,look and feel 22Facebook on Dec. 25, 2005. Via Internet Archive‘s Wayback Machine
  • 23.
    Lost context: storiesand meaning #jan25 23
  • 24.
    #6 Data loss •Deleted tweets 24
  • 25.
  • 26.
  • 27.
  • 28.
    #7: Ethics andPrivacy 28 2005 2011
  • 29.
  • 30.