Making our mark: the important role of social scientists in the ‘era of big data’ - Rebecca Eynon

Presentation given at the HEA Social Sciences learning and teaching summit 'Exploring the implications of ‘the era of big data’ for learning and teaching'.

A blog post outlining the issues discussed at the summit is available via: http://bit.ly/1lCBUIB

Published in: Education, Technology
  • (think of how we were able to converge upon more meaningful clusters with some iteration)

    1. Making our mark: the important role of social scientists in the ‘era of big data’
       Dr Rebecca Eynon, Oxford Internet Institute, University of Oxford
    2. Overview
       • Big data: hype and reality
       • Use of big data should not be a specialism of only a few social scientists
       • What kinds of skills and knowledge do social scientists need?
    3. The allure of big data
    4. Big data: the end of social science as we know it?
       • “Petabytes allow us to say: ‘Correlation is enough.’ We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.” The end of theory: the data deluge makes the scientific method obsolete (Chris Anderson, Wired Magazine, 2008)
    5. The coming crisis of empirical sociology
       • “A world inundated with complex processes of social and cultural digitization; a world in which commercial forces predominate; a world in which we, as sociologists, are losing whatever jurisdiction we once had over the study of the ‘social’ as the generation, mobilization and analysis of social data become ubiquitous” (Savage and Burrows, 2009:763)
    6. A valuable addition or a radical rethink?
       • An open question
       • Big data is not perfect
       • But it is not just hype
    7. Why big data is not perfect (1)
       • Big data prioritises certain people
       • Who has access to data is not straightforward
       • Data as a commodity
       • Commercial vs. public
       • Availability of data tends to drive the questions
       • Questions that are difficult to measure / collect data on are dropped
    8. Why big data is not perfect (2)
       • Just because data is available does not mean we should use it
       • Privacy in public, public trust and accountability
       • How we use results from big data approaches matters
       • Risks of misuse of data, power structures in society
    9. Social science is well positioned to address these issues
       • But are we doing enough?
       • We are at risk of handing over aspects of social science to computer scientists, physicists and engineers
       • Few social science journals publish findings from big data
       • A lot of funding is going outside social science for questions that we used to be solely responsible for addressing
    10. Learning & teaching
       • Data science courses have options in social science
       • Few courses in social science that offer data science
       • Students have to seek out opportunities for themselves
       • If data scientists can learn about social science then social scientists can learn about data science
    11. What kinds of skills and knowledge do social scientists need?
       • On a continuum
       • We do not all need to be experts, but we need to know enough
       • Undergraduate & postgraduate
       • Ultimately, the use of big data will always be a team exercise
    12. Language of multidisciplinary work
       • Need to be able to speak multiple ‘languages’ of the different disciplines
       • Or learn how to build a common vocabulary within specific project teams about the data, the different methods, the findings etc.
    13. Awareness of cultural differences
       • “In many cases when we analyse big datasets we see patterns that are not intuitive. Of course we need to build a theory (model, in our language) to explain the observation, but in many cases I was asked why I think the data looks like this and even sometimes: ‘your observation cannot be correct’. I guess this is rooted in the differences in the disciplines. In social sciences usually you build a theory and then gather the data to support it, whereas in data-driven sciences you first observe something and then try to build a theory. Usually the observation can't be wrong (unless your measurements are wrong for technical reasons).” (Data Scientist, OII)
    14. Ethics of big data
       • Clear understandings of the ethical implications of gathering, storing and using big data
       • Personal codes vs institutional arrangements
       • Difference between law and ethical practice
       • Recognition of “privacy in public” and general respect for people
       • Care over what we do with the data and how our work is used
       • Commitment to public debate and transparency about the use of this data
    15. Understanding the data
       • Thinking about data differently, and what constitutes data
       • Understanding the representation of the data
       • Linking data sets
    16. Being clear about the data
       • “Usually data people are careless with words. They tend to give names to their observed parameters which can be misleading. They count how many times two people have called each other during a 6 month period and call this quantity ‘friendship strength’. They count how many times people have mentioned Obama in their tweets and call it the ‘political index’ of the user... What I like about social scientists is that they are very careful with words and terms and their definitions.” (Data Scientist, OII)
    17. Awareness and use of mixed method designs
       • Working within a pragmatic paradigm
       • Three levels of data (Welser et al., 2008):
           • structural description (patterns of interactions)
           • thin descriptions, which note the content of the interaction
           • thick description, to provide rich context and convey the meaning of events to those who participated in them
       • Linking methods at three different levels can be very valuable
    18. Understanding the analysis
       • Having an intuition for what processes/algorithms are being applied to datasets, particularly in the context of the application domain (e.g. knowing the application domain very well) to be able to refine approaches
       • “[The Sociologist] always asks me, ‘Okay show me a code and explain to me which part of the code is doing which part, just very brief understanding of how this computer program is working’. So I was learning some sociology from her and she is learning some computer science programming skills from me so it’s kind of mutual.” (Sloan Big Data Project, http://www.oii.ox.ac.uk/research/projects/?id=98)
    19. Interpretation
       • Crucial – the core role of the social scientist in big data projects
       • An ability to write “the story” for different audiences
       • Not possible if we do not understand (at least at some level) what has happened at all stages of the research process
    20. Expertise among project partners (Williford and Henry, 2012)
    21. Learning & teaching within the wider ecology of HE
       • Training for policy makers
       • Training for current academics
       • Interdisciplinary support structures across universities
       • Assessment process for student work
       • Challenges for the individual doctoral student
       • REF, early career support and job opportunities
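
The point on slide 16 about naming measures carefully could be illustrated with a small sketch. It is not taken from the presentation: the call log, the names in it, and the function `count_calls_per_pair` are all hypothetical. The idea is only that a variable can be named for what was counted (calls per pair over the observation window) rather than for the construct it might be taken to represent (such as "friendship strength").

```python
from collections import Counter

# Hypothetical call log (an assumption for illustration): one
# (caller, callee) tuple per call observed in a six-month window.
call_log = [
    ("ana", "ben"),
    ("ben", "ana"),
    ("ana", "cat"),
    ("ben", "ana"),
]


def count_calls_per_pair(calls):
    """Count calls between each unordered pair of people."""
    counts = Counter()
    for caller, callee in calls:
        # Treat A->B and B->A as the same pair.
        pair = tuple(sorted((caller, callee)))
        counts[pair] += 1
    return counts


# Named for what was measured, not for what it might be assumed to mean.
calls_per_pair_6mo = count_calls_per_pair(call_log)
print(calls_per_pair_6mo)
```

Whether such a count says anything about friendship is then a separate, explicit interpretive step, which is where the slides place the social scientist's contribution.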
