
Grounded theory meets big data: One way to marry ethnography and digital methods

Dhiraj Murthy

Published in: Science

  1. Grounded theory meets Big Data: One way to marry ethnography and digital methods
     May 2016
     Dhiraj Murthy | @dhirajmurthy | d.murthy@gold.ac.uk
     CAST: Social Media Research Cluster
  2. Objectives
     • There are unique challenges associated with data collection and analysis on social media platforms
     • How do we integrate and weigh Big Data questions and more in-depth contextualized analysis of social media content?
     • How do we categorize textual and visual content, addressing issues of ontology?
     • How can grounded theory be applied to coding schemes?
  3. Starting points
     • Big data methods have been successfully applied to Twitter data (indeed, 16% of research on Twitter employed sentiment analysis (Zimmer and Proferes, 2014))
     • We may think that anything about human behavior can be deciphered from Twitter data, but that simply is not true.
     • There are also challenges associated with data collection and analysis on Twitter (boyd & Crawford, 2012).
     • Closed coding systems are thought to be the best for studying Twitter data
     • However, social media data involve very ‘messy’ elements, and mixed approaches can have high utility
  4. New ontologies
     So perhaps we need to … challenge traditional ontological assumptions!
     Hardt and Negri (2005, p. 312) argue that this type of a critical ‘new ontology’ is part of their desire not to engage in “repeating old rituals”, but, rather, “launching a new investigation in order to formulate a new science of society and politics [… that] is not about piling up statistics or mere sociological facts [… but] immersing ourselves in the movements of history and the anthropological transformations of subjectivity.”
  5. First: So what does Twitter API data look like?
     "user": {
       "name": "dhirajmurthy",
       "friendsCount": 771,
       "followersCount": 1534,
       "listedCount": 100,
       "statusesCount": 2609
     }
     This is an excerpt of API-delivered JavaScript Object Notation (JSON) data for my Twitter ID.
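
To make the excerpt on this slide concrete, here is a minimal Python sketch of reading such a "user" object. The fragment is wrapped in braces, and any trailing comma removed, so that it parses on its own. The `follower_ratio` measure is a hypothetical illustration and is not part of the slides.

```python
import json

# The "user" excerpt from the slide, wrapped in braces so it is valid JSON.
raw = """
{
  "user": {
    "name": "dhirajmurthy",
    "friendsCount": 771,
    "followersCount": 1534,
    "listedCount": 100,
    "statusesCount": 2609
  }
}
"""

user = json.loads(raw)["user"]

# Hypothetical derived measure (an assumption, not from the slides):
# followers per account followed.
follower_ratio = user["followersCount"] / user["friendsCount"]

# Print every profile field, then the derived measure.
for field_name, value in user.items():
    print(f"{field_name}: {value}")
print(f"follower_ratio: {follower_ratio:.2f}")
```

Profile counts like these are the kind of metadata that sits alongside the tweet text in the API response.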
  6. What is often missing in Twitter-based research
     • Be open in the inquiry, allowing coding to be emergent.
     • Ask what is happening in the tweet (not just body text). Think about JSON data holistically.
     • What are these tweet data helping us study, speaking broadly?
     • Are we being reflexive on the point of view/standpoint we are interpreting?
     • Are we being flexible or following prescribed rules?
  7. Beyond induction and deduction…
     • ‘Big data is [..] most effective when researchers take account of the complex methodological processes that underlie the analysis of that data’ (boyd & Crawford, 2012, p. 668)
     • And inductive and deductive methods have their own limitations
  8. Beyond induction and deduction…
     • Abductive methods, a form of reasoning ‘for finding the best explanations among a set of possible ones’ (Paul, 1993), are an alternative approach
     • Retroduction is a type of abductive method that emphasizes “asking why” (Olsen, 2012: 215). Researchers are able to probe the data regularly and to “avoid overgeneralisation but searching for reasons and causes” (p. 216) instead. Or put another way, “the retroductive researcher, unlike the inductive researcher, has something to look for” (Blaikie, 2004).
  9. Methods
     Emergent coding methods can be implemented operationally in a systematic fashion to build critical, reflective, conceptual knowledge of Twitter-derived data.
     [Figure: Theory building, adapted from Goulding, C. (2002), Grounded Theory: Sage, p. 115]
  10. In Practice
     • Be open in the inquiry, allowing coding to be emergent.
     • Tweets are not merely bits of text. Ask what is happening in the tweet (not just body text). Think about JSON object data holistically (cf. Manovich’s (2001) ‘digital objects’).
     • What are these tweet data helping us study, speaking broadly?
     • Are we being reflexive on the point of view / standpoint we are interpreting?
     • Are we being flexible or following prescribed rules?
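
One way to read "think about JSON object data holistically" is to keep everything that surrounds the tweet text attached to the unit being coded, and to let codes and memos accumulate on it. The Python sketch below illustrates that idea only. The `CodedTweet` class, the `to_coded_tweet` helper, the tweet record, and the "humor" code and memo are all invented for illustration, and the field names other than those in the "user" excerpt are assumptions rather than the Twitter API's schema.

```python
from dataclasses import dataclass, field


@dataclass
class CodedTweet:
    """One unit of analysis: the tweet text plus the context around it."""
    text: str
    context: dict
    codes: set = field(default_factory=set)
    memos: list = field(default_factory=list)

    def add_code(self, code, memo=""):
        """Attach an emergent code and, optionally, the memo behind it."""
        self.codes.add(code)
        if memo:
            self.memos.append(f"{code}: {memo}")


def to_coded_tweet(tweet):
    """Split a tweet object into body text and everything else (context)."""
    context = {key: value for key, value in tweet.items() if key != "text"}
    return CodedTweet(text=tweet.get("text", ""), context=context)


# Invented example record; only the "user" keys mirror the slide excerpt.
tweet = {
    "text": "Example tweet body #accidentalracist",
    "hashtags": ["accidentalracist"],
    "user": {"name": "dhirajmurthy", "followersCount": 1534},
}

record = to_coded_tweet(tweet)
# Codes are not fixed in advance; they are added as they emerge.
record.add_code("humor", memo="tone only clear from profile and hashtag use")
print(record.codes, sorted(record.context))
```

Because codes live in an open set rather than a fixed enumeration, new categories can be added at any stage without changing the record structure.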
  11. Case study: Accidental Racist
  12. Continuous open coding Twitter data model applied to #accidentalracist, a hashtag associated with a 2013 duet by Brad Paisley and LL Cool J
     [Figure: Data collection and relationship model; adapted from Corbin, J. and Strauss, A. (2015), Basics of qualitative research: techniques and procedures for developing grounded theory, Thousand Oaks: Sage, p. 8]
  13. • Operationalizing this ontology requires several stages of coding
     • Memo making during collection and analysis is integral to both coding development and theory building
     • Comparisons across diverse data at each stage provide reflexivity and triangulation
  14. Computational method first
     • One can effectively use machine learning approaches such as Latent Dirichlet allocation (LDA) to derive topic clusters around a Twitter corpus
     • This can be used to inform what coding categories are deployed for not only tweet content, but profiles and other metadata
     • Example: Topic clusters derived from 90,986 cancer-related tweets (with keywords: cancer, mammogram, lymphoma, melanoma, and cancer survivor)
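
As a rough illustration of how LDA derives topic clusters, the Python sketch below fits a tiny LDA model by collapsed Gibbs sampling using only the standard library. It is not the pipeline used for the 90,986-tweet corpus, which the slides do not describe. The four toy documents, the hyperparameters `alpha` and `beta`, and the function names `lda_gibbs` and `top_words` are assumptions made for this example.

```python
import random


def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return (vocab, topic-word counts)."""
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            z = rng.randrange(n_topics)
            doc_assignments.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[word]] += 1
            topic_total[z] += 1
        assignments.append(doc_assignments)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                v = word_id[word]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Unnormalised conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1
    return vocab, topic_word


def top_words(vocab, topic_word, n=3):
    """Return the n highest-count words for each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in topic_word
    ]


# Invented, pre-tokenised toy "tweets"; not data from the study.
docs = [
    ["mammogram", "screening", "appointment", "mammogram"],
    ["screening", "mammogram", "clinic"],
    ["melanoma", "sunscreen", "skin"],
    ["skin", "melanoma", "sunscreen", "sunscreen"],
]

vocab, topic_word = lda_gibbs(docs, n_topics=2)
for k, words in enumerate(top_words(vocab, topic_word)):
    print(f"topic {k}: {words}")
```

The top words per topic are candidate labels only. In the approach outlined on this slide, they inform which coding categories a researcher then tests against tweets, profiles and other metadata.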
  15. Conclusions
     • Social media are complex sociotechnical spaces
     • Presentation of the self is often highly nuanced – a case particularly complicated with uses of humor, a frequent theme on Twitter
     • Coded content can present different perspectives on social interactions and these data are complementary to computational methods
     • Combining emergent grounded theory with machine learning or vice versa can advance both qualitative and computational methods
  16. Dhiraj Murthy
     Reader of Sociology at Goldsmiths, University of London
     @dhirajmurthy
     d.murthy@gold.ac.uk
  17. References
     • Blaikie, N. (2004). Retroduction. In M. S. Lewis-Beck, A. Bryman & T. F. Liao (Eds.), The SAGE Encyclopedia of Social Science Research Methods (p. 973). Thousand Oaks: Sage.
     • boyd, d., & Crawford, K. (2012). Critical questions for Big Data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication & Society, 15(5), 662-679.
     • Corbin, J., & Strauss, A. (2015). Basics of qualitative research: techniques and procedures for developing grounded theory. Los Angeles: Sage.
     • Hardt, M., & Negri, A. (2005). Multitude: War and democracy in the age of Empire. New York: Penguin.
     • Murthy, D. (2011). Emergent digital ethnographic methods for social research. Handbook of Emergent Technologies in Social Research, Oxford University Press, Oxford, 158-179.
     • Olsen, W. K. (2012). Data collection: key debates and methods in social research. London; Thousand Oaks, Calif.: SAGE.
     • Paul, G. (1993). Approaches to abductive reasoning: an overview. Artificial Intelligence Review, 7(2), 109-152.
     • Zimmer, M., & Proferes, N. J. (2014). A topology of Twitter research: disciplines, methods, and ethics. Aslib Journal of Information Management, 66(3), 250-261. doi:10.1108/AJIM-09-2013-0083.
  18. Selected Work
     Most can be downloaded from http://www.dhirajmurthy.com/about/
     • Twitter: Social Communication in the Twitter Age. 2013, with Polity Press
     • ‘Big Data Solutions On a Small Scale: Evaluating Accessible High Performance Computing for Social Research’, Big Data and Society (with Bowman, S.), 2014
     • ‘Modeling virtual organizations with Latent Dirichlet Allocation: A case for natural language processing’, Neural Networks (with Gross, A.), Volume 58, pp. 38-49, 2014.
     • ‘Social Media, Collaboration, and Scientific Organizations’, American Behavioral Scientist (with Lewis, J.P.), 2014.
     • ‘Comparing Print Coverage and Tweets in Elections: a Case Study of the 2011-2012 US Republican Primaries’, Social Science Computer Review (with Petto, L.), 2014
     • ‘Twitter and Disasters: the uses of Twitter during the 2010 Pakistan floods’, Information Communication and Society, Volume 16, Issue 6, 2013, pp. 837-855.
     • ‘Emergent Data Mining Tools for Social Network Analysis’ in Data Mining in Dynamic Social Networks and Fuzzy Systems (Bhatnagar, V. ed.), pp. 40-57 (with Gross, A. and Takata, A.), 2013.
     • ‘Evaluation and Development of Data Mining Tools for Online Social Networks’ in Mining Social Networks and Security Informatics (Özyer, T. et al. eds.), pp. 183-202 (with Gross, A., Takata, A., Bond, S.), 2013.
     • Murthy, D., Gross, A., Oliveira, D. ‘Understanding Cancer-based Networks in Twitter using Social Network Analysis’ in IEEE International Conference on Semantic Computing Proceedings. Palo Alto, California, 2011
