Accessing and Using Big Data to
Advance Social Science Knowledge
Eric Meyer, Ralph Schroeder, Linnet
Taylor and Josh Cowls
Overview
• Project Introduction (Eric)
– Papers
– Conferences and outreach
– Data sources
• Big data theory, definition, limits to
science and its uses (Ralph)
• Case study: using big data to gauge public
opinion (Josh)
Accessing and Using Big Data to Advance Social
Science Knowledge
• October 2012 – August 2014
• Funded by the Alfred P. Sloan Foundation
• Data sources
• 120+ interviews, mainly with social scientists
• Workshops and conferences
• No representative sample, but some patterns of
disciplinary and skills background and career
trajectory
Project scope to date
• Economics
• Business (Nemode project with Greg, Monica)
• Wikipedia
• Social Media
• Development
• Political Science
• Causation and Theory
• Policy
• (20+ blog posts on a range of other issues)
Outcomes and outreach
• A series of workshops hosted at the Oxford Internet Institute:
– 2013, March. Big Data: Rewards and Risks for the Social Sciences.
Oxford Internet Institute, Oxford.
http://www.oii.ox.ac.uk/events/?id=557
– 2013, March. Qualitative methods to study data practices: An
international comparative workshop. Oxford Internet Institute, Oxford
– 2013, January. Towards a Sociology of Data. Oxford Internet Institute,
Oxford. http://www.oii.ox.ac.uk/events/?id=560
– 2014, May. Big data and social change in the developing world,
Rockefeller Foundation Bellagio Centre conference
– 2014, August. American Political Science Association Panel on ‘Big
Data and Political Science’
• MSc option course ‘Big Data and Society’
Definition
• ‘Big data’
– the advance of knowledge via a leap in the scale
and scope in relation to a given object or
phenomenon
• ‘Data’
– Belongs to the object
– ‘taking…before interpreting’ (Ian Hacking)
• the view that ‘all data are of their nature interpreted’ is
misleading: ‘data are made, but as a good first
approximation, the making and taking come before
interpreting’
– The most atomizable useful unit of analysis
Digital Objects and their Referents
[Diagram: Digital Object (examples: Twitter, Tesco loyalty card information) and Real World (people / physical objects), linked by ‘Representing’ and ‘Manipulating’; further labels: ‘Limits’, ‘Digital Data’]
Uses and Limits
• Big data research uses (academic, commercial, government) are limited to
the exploitation of suitable objects, and the objects which ‘give off’ digital
data, and the phenomena they lay bare, are limited
• The knowledge produced is aimed at ‘sorting people’ and advancing
‘representing and intervening’ (but without ‘manipulating’, except where
this is warranted by practical economic and political objectives)
• The difference between the commercial and the academic world is that in the former knowledge provides
competitive and practical advantage, whereas in the latter it advances (high-consensus
rapid-discovery) knowledge
– The limits in both cases are the objects (to which the data ‘belong’), which need to
have digitally manipulable data points available
• How available these objects are differs, but also…
– Causation and theoretical embedding matters for academic social science
– For commercial (and non-academic uses), ‘predicting’ consumer choices and other
behaviours, for limited purposes and without increasing scientific knowledge, is good
enough
• There are many objects, for non-academics and scientists to humanities
scholars (physical, human, cultural), but they are not infinite
• This availability, not skills or other issues, determines the future of big data
research
(Big) data definition enables
pinpointing impacts and threats
• ‘Google Plus may not be much of a competitor to Facebook as
a social network, but…some analysts…say that Google
understands more about people’s social activity than
Facebook does.’
– New York Times, 15.2.2014, p. A1 ‘The Plus in Google Plus? It’s Mostly for Google’.
• Facebook Likes: ‘Predicting users’ individual attributes and
preferences can be used to improve numerous products and
services. For instance, digital systems and devices (such as
online stores or cars) could be designed to adjust their
behavior to best fit each user’s inferred profile…online
insurance…advertisements might emphasize security when
facing emotionally unstable (neurotic) users but stress
potential threats when dealing with emotionally stable ones’
– ‘Private traits and attributes are predictable from digital records of human behavior.’ Kosinski M,
Stillwell D, Graepel T. Proc Natl Acad Sci 2013 Apr 9;110(15):5802-5.
• More powerful knowledge will enable better services, and
more manipulation
Ethical and Social Issues in Big Data Research
• Objects with ‘total’ knowledge (universes)
– Danger is inferring behaviour not of individuals, but of classes of
people
• Asymmetry of knower and the subjects of knowledge is
greater than elsewhere
• Based not on individuals’ but on aggregate behaviour
– Hence only utilitarian, not Kantian justification?
• Why does prediction or uncovering laws of behaviour ‘grate’?
• Benefits: greater scientific power and more specific details
• Relation to smaller data? ‘Creep’
• Solution: ethical = greater researcher and public awareness,
regulatory (would apply to academic researchers?) = prevent
legal and specific harms
Outlook and Implications
• There is an overlap between real world research and
the world of academic research which is closer than
elsewhere
– because this is the research front in both
– because they share common objects
• For research
– Develop theoretical frame in which to embed big data (for
social media), including power/function, relation to
traditional media, and role in society
• For society
– Awareness of how research can generate transparency and
manipulability
• Big Brother?
– Yes, but also Brave New World of Omniscience, with Social
Science as Handmaiden
Exemplar case:
using big data to gauge public opinion
Opportunities and challenges
The traditional model
• The notion of public opinion
was enlivened by the coffee
houses of 17th Century Britain
• Inferential statistics provided a
rigorous, replicable basis for
reporting public opinion, based
on a random sample and MOE
• Remained expensive, random
sample difficult to construct,
response rates dwindling
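The margin of error (MOE) mentioned above can be illustrated with a minimal sketch. It assumes a simple random sample and the usual normal approximation for a proportion; the function name and the poll figures are hypothetical and do not come from the project.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a proportion p from a simple random sample of size n.

    z=1.96 corresponds to a 95% confidence level under the normal approximation.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return z * math.sqrt(p * (1.0 - p) / n)

# Hypothetical poll: 1,000 respondents, 50% support (the worst case for the MOE).
moe = margin_of_error(0.5, 1000)
print(f"+/- {moe * 100:.1f} percentage points")  # prints "+/- 3.1 percentage points"
```

Because the MOE shrinks only with the square root of n, quadrupling the sample roughly halves it, which is one reason the traditional model stays expensive.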
Using big data approaches
• n=all: beyond the sample
• Cheaper (after initial investment)
• More granularity, more insight?
[Chart with axes ‘Cost’ and ‘Utility’, comparing the traditional inferential model and the big data model]
But challenges remain…
• Representativeness
• Reliability
• Replicability
The challenge of representativeness
≠
Amber Boydstun: anyone who does a Twitter study
has to really work hard, I've noticed, to justify why
we should care about Twitter because Twitter is not
representative of the United States population or
the world population, right. And it's not. And even
if it was representative, or even if we don't care that
it's not representative, it's really hard to figure out in
any given study whether you're getting an over-
sampling of those users who are just more active
than other users.
Data from Dutton, W.H. and Blank, G., with Groselj, D.
(2013) Cultures of the Internet: The Internet in Britain.
Oxford Internet Survey 2013. Oxford Internet Institute,
University of Oxford.
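The over-sampling concern raised in the quote above can be shown with a small sketch: counting every tweet equally lets the most active users dominate, while counting each user once gives a different picture. The data, user names and function names are invented for illustration only.

```python
from collections import defaultdict

def share_by_tweet(tweets):
    """Share of positive tweets, with every tweet counted equally."""
    return sum(t["positive"] for t in tweets) / len(tweets)

def share_by_user(tweets):
    """Average of each user's own positive share, so every user counts once."""
    per_user = defaultdict(list)
    for t in tweets:
        per_user[t["user"]].append(t["positive"])
    user_shares = [sum(v) / len(v) for v in per_user.values()]
    return sum(user_shares) / len(user_shares)

# Hypothetical data: one very active user posts 8 positive tweets,
# four occasional users post 1 negative tweet each.
tweets = [{"user": "heavy_user", "positive": 1} for _ in range(8)]
tweets += [{"user": f"light_user_{i}", "positive": 0} for i in range(4)]

print(round(share_by_tweet(tweets), 2))  # 0.67: looks like majority support
print(round(share_by_user(tweets), 2))   # 0.2: only one user in five is positive
```

Per-user aggregation only removes the activity bias inside the platform; it does nothing about the platform's users differing from the wider population.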
The challenge of reliability
Mike Thelwall: really the big problem that
we haven’t cracked is that if someone
tweets a sentiment it’s not necessarily what
they’re feeling, it can be for a variety of
reasons, so it doesn’t really reflect directly
what they feel necessarily … so it’s quite a
stretch to say that if someone tweets, “I’m
happy” that they’re actually happy, to give
a simple example
• Difficult to establish the meaning of latent messages
• Platform specific behaviours (e.g. hashtags, likes) are not
always understood
• Political discourse often laced with sarcasm
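A minimal sketch of why these points matter for automated analysis: a naive word-list sentiment scorer takes ‘I’m happy’ at face value and scores a sarcastic complaint as positive. The tiny lexicon and the example sentences are hypothetical, and this is not the method used by any of the researchers quoted.

```python
import re

# Hypothetical toy lexicon: +1 for positive words, -1 for negative words.
LEXICON = {"happy": 1, "great": 1, "love": 1, "sad": -1, "terrible": -1, "hate": -1}

def naive_sentiment(text):
    """Sum the lexicon scores of the words in text, ignoring context and tone."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)

literal = "I'm happy"
sarcastic = "Oh great, another three-hour delay. I'm so happy."

print(naive_sentiment(literal))    # 1
print(naive_sentiment(sarcastic))  # 2: scored as more positive than the literal tweet
```

The scorer has no access to what the author feels, only to the words used, which is the gap between expressed and actual sentiment described above.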
The challenge of replicability
• Social data is often proprietary – getting access can be
difficult, expensive or impossible
• Sometimes access is limited to output – analysis takes
place inside a black box
• Challenges basic Popperian assumption of falsifiability
Nick Anstead: there are all these
companies that do all this wonderful stuff,
but actually as an academic researcher,
using them is expensive … what do you
actually get from working with these
companies? Do you get raw data sets that
you go and do stuff with yourself? More
commonly, I would suggest, what you
probably get is access to, sort of, a black
box tool.
… and the implications aren’t just
academic
Shelton, T. et al., ‘Mapping the data shadows of
Hurricane Sandy: Uncovering the sociospatial
dimensions of “big data”’. Geoforum, Volume 52,
March 2014, pp. 167-79
Project Papers
Schroeder, Ralph (Forthcoming). ‘Big Data: Towards a More Scientific Social Science and Humanities’ in Mark Graham and William H
Dutton (eds.), Society and the Internet: How Networks of Information are Changing our Lives. Forthcoming.
Schroeder, Ralph, & Taylor, Linnet (Forthcoming). ‘Is bigger better? The emergence of big data as a tool for international development
policy.’ GeoJournal.
Meyer, Eric T., Schroeder, Ralph, & Taylor, Linnet (2013, August). ‘Big Data in the Study of Twitter, Facebook and Wikipedia: On the Uses
and Disadvantages of Scientificity for Social Research.’ Paper presented at the proceedings of the Annual Meeting of the American
Sociological Association. (being submitted)
Schroeder, Ralph, & Taylor, Linnet. ‘Big Data and Wikipedia Research: Social Science Knowledge across Disciplinary Divides’. Submitted to
Information, Communication and Society.
Taylor, Linnet. ‘No place to hide? The ethics and analytics of tracking mobility using African mobile phone data’. Submitted to Population,
Space and Place.
Meyer, Eric T., Schroeder, Ralph, & Taylor, Linnet. ‘Big Data in the Social Sciences: Towards a New Research Paradigm?’ (being
submitted).
Meyer, Eric T., Schroeder, Ralph, & Taylor, Linnet (2013, November). ‘The Boundaries of Big Data.’ Paper presented at SIG-SI Symposium,
ASIST 2013, November 1-6, 2013, Montreal, Quebec, Canada.
Schroeder, Ralph and Cowls, Josh. ‘Answering Questions and Questioning Answers in the Era of Big Data.’ In preparation.
Taylor, Linnet, Meyer, Eric T., & Schroeder, Ralph. ‘Bigger and better, or more of the same? Emerging practices and perspectives on big
data analysis in economics’. Forthcoming in Big Data & Society.
Cowls, Josh. ‘The Crowd in the Cloud?’, forthcoming presentation at IPP 2014.
Cowls, Josh. ‘Big Data and Policy Implementation’, in preparation.
Schroeder, Ralph. ‘Big Data and Policy Implications’, in preparation.

08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 

Recently uploaded (20)

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 

Accessing Big Data to Advance Social Sciences

  • 1. Accessing and Using Big Data to Advance Social Science Knowledge Eric Meyer, Ralph Schroeder, Linnet Taylor and Josh Cowls
  • 2. Overview • Project Introduction (Eric) – Papers – Conferences and outreach – Data sources • Big data theory, definition, limits to science and its uses (Ralph) • Case study: using big data to gauge public opinion (Josh)
  • 3. Accessing and Using Big Data to Advance Social Science Knowledge • October 2012 – August 2014 • Funded by the Alfred P. Sloan Foundation • Data sources • 120+ interviews, mainly with social scientists • Workshops and conferences • No representative sample, but some patterns of disciplinary and skills background and career trajectory
  • 4. Project scope to date • Economics • Business (Nemode project with Greg, Monica) • Wikipedia • Social Media • Development • Political Science • Causation and Theory • Policy • (20+ blog posts on a range of other issues)
  • 5. Outcomes and outreach • A series of workshops hosted at the Oxford Internet Institute: – 2013, March. Big Data: Rewards and Risks for the Social Sciences. Oxford Internet Institute, Oxford. http://www.oii.ox.ac.uk/events/?id=557 – 2013, March. Qualitative methods to study data practices: An international comparative workshop. Oxford Internet Institute, Oxford – 2013, January. Towards a Sociology of Data. Oxford Internet Institute, Oxford. http://www.oii.ox.ac.uk/events/?id=560 – 2014, May. Big data and social change in the developing world, Rockefeller Foundation Bellagio Centre conference – 2014, August. American Political Science Association Panel on ‘Big Data and Political Science’ • MSc option course ‘Big Data and Society’
  • 6. Definition • ‘Big data’ – the advance of knowledge via a leap in the scale and scope in relation to a given object or phenomenon • ‘Data’ – Belongs to the object – ‘taking…before interpreting’ (Ian Hacking) • the view that ‘all data are of their nature interpreted’ is misleading: ‘data are made, but as a good first approximation, the making and taking come before interpreting’ – The most atomizable useful unit of analysis
  • 7. Digital Objects and their Referents [Diagram: Digital Object (examples: Twitter, Tesco loyalty card information); Real World (people / physical objects); relations between them: Represent / Manipulate]
  • 9. Uses and Limits • Big data research uses (academic, commercial, government) are limited to the exploitation of suitable objects, and the objects which ‘give off’ digital data, and the phenomena they lay bare, are limited • The knowledge produced is aimed at ‘sorting people’ and advancing ‘representing and intervening’ (but without ‘manipulating’, except where this is warranted by practical economic and political objectives) • The difference between the commercial and the academic world is that in the former knowledge provides competitive and practical advantage, as against advancing (high-consensus rapid-discovery) knowledge – The limits in both cases are the objects (to which the data ‘belong’), which need to have digitally manipulable data points available • How available these objects are differs, but also… – Causation and theoretical embedding matter for academic social science – For commercial (and other non-academic) uses, ‘predicting’ consumer choices and other behaviours, for limited purposes and without increasing scientific knowledge, is good enough • There are many objects, for everyone from non-academics and scientists to humanities scholars (physical, human, cultural), but they are not infinite • This availability, not skills or other issues, determines the future of big data research
  • 10. (Big) data definition enables pinpointing impacts and threats • ‘Google Plus may not be much of a competitor to Facebook as a social network, but…some analysts…say that Google understands more about people’s social activity than Facebook does.’ – New York Times, 15.2.2014, p. A1, ‘The Plus in Google Plus? It’s Mostly for Google’. • Facebook Likes: ‘Predicting users’ individual attributes and preferences can be used to improve numerous products and services. For instance, digital systems and devices (such as online stores or cars) could be designed to adjust their behavior to best fit each user’s inferred profile…online insurance…advertisements might emphasize security when facing emotionally unstable (neurotic) users but stress potential threats when dealing with emotionally stable ones’ – ‘Private traits and attributes are predictable from digital records of human behavior.’ Kosinski M, Stillwell D, Graepel T., Proc Natl Acad Sci 2013 Apr 9;110(15):5802-5. • More powerful knowledge will enable better services, and more manipulation
  • 11. Ethical and Social Issues in Big Data Research • Objects with ‘total’ knowledge (universes) – Danger is inferring behaviour not of individuals, but of classes of people • Asymmetry of knower and the subjects of knowledge is greater than elsewhere • Based not on individuals’ but on aggregate behaviour – Hence only utilitarian, not Kantian justification? • Why does prediction or uncovering laws of behaviour ‘grate’? • Benefits: greater scientific power and more specific details • Relation to smaller data? ‘Creep’ • Solution: ethical = greater researcher and public awareness, regulatory (would apply to academic researchers?) = prevent legal and specific harms
  • 12. Outlook and Implications • There is an overlap between real world research and the world of academic research which is closer than elsewhere – because this is the research front in both – because they share common objects • For research – Develop theoretical frame in which to embed big data (for social media), including power/function, relation to traditional media, and role in society • For society – Awareness of how research can generate transparency and manipulability • Big Brother? – Yes, but also Brave New World of Omniscience, with Social Science as Handmaiden
  • 13. Exemplar case: using big data to gauge public opinion Opportunities and challenges
  • 14. The traditional model • The notion of public opinion was enlivened by the coffee houses of 17th-century Britain • Inferential statistics provided a rigorous, replicable basis for reporting public opinion, based on a random sample and a margin of error (MOE) • But polling remained expensive, random samples are difficult to construct, and response rates are dwindling
  • 15. Using big data approaches • n=all: beyond the sample • Cheaper (after initial investment) • More granularity, more insight? [Chart: cost against utility, comparing the traditional inferential model with the big data model]
  • 16. But challenges remain… • Representativeness • Reliability • Replicability
  • 17. The challenge of representativeness ≠ Amber Boydstun: anyone who does a Twitter study has to really work hard, I've noticed, to justify why we should care about Twitter because Twitter is not representative of the United States population or the world population, right. And it's not. And even if it was representative, or even if we don't care that it's not representative, it's really hard to figure out in any given study whether you're getting an over-sampling of those users who are just more active than other users. Data from Dutton, W.H. and Blank, G., with Groselj, D. (2013) Cultures of the Internet: The Internet in Britain. Oxford Internet Survey 2013. Oxford Internet Institute, University of Oxford.
  • 18. The challenge of reliability Mike Thelwall: really the big problem that we haven’t cracked is that if someone tweets a sentiment it’s not necessarily what they’re feeling, it can be for a variety of reasons, so it doesn’t really reflect directly what they feel necessarily … so it’s quite a stretch to say that if someone tweets, “I’m happy” that they’re actually happy, to give a simple example • Difficult to establish the meaning of latent messages • Platform-specific behaviours (e.g. hashtags, likes) are not always understood • Political discourse is often laced with sarcasm
  • 19. The challenge of replicability • Social data is often proprietary – getting access can be difficult, expensive or impossible • Sometimes access is limited to output – analysis takes place inside a black box • Challenges the basic Popperian assumption of falsifiability Nick Anstead: there are all these companies that do all this wonderful stuff, but actually as an academic researcher, using them is expensive … what do you actually get from working with these companies? Do you get raw data sets that you go and do stuff with yourself? More commonly, I would suggest, what you probably get is access to, sort of, a black box tool.
  • 20. … and the implications aren’t just academic Shelton, T. et al., ‘Mapping the data shadows of Hurricane Sandy: Uncovering the sociospatial dimensions of “big data”’. Geoforum, Volume 52, March 2014, pp. 167-79
  • 21. Project Papers Schroeder, Ralph (Forthcoming). ‘Big Data: Towards a More Scientific Social Science and Humanities’ in Mark Graham and William H Dutton (eds.), Society and the Internet: How Networks of Information are Changing our Lives. Schroeder, Ralph, & Taylor, Linnet (Forthcoming). ‘Is bigger better? The emergence of big data as a tool for international development policy.’ GeoJournal. Meyer, Eric T., Schroeder, Ralph, & Taylor, Linnet (2013, August). ‘Big Data in the Study of Twitter, Facebook and Wikipedia: On the Uses and Disadvantages of Scientificity for Social Research.’ Paper presented at the Annual Meeting of the American Sociological Association (being submitted). Schroeder, Ralph, & Taylor, Linnet. ‘Big Data and Wikipedia Research: Social Science Knowledge across Disciplinary Divides’. Submitted to Information, Communication and Society. Taylor, Linnet. ‘No place to hide? The ethics and analytics of tracking mobility using African mobile phone data.’ Submitted to Population, Space and Place. Meyer, Eric T., Schroeder, Ralph, & Taylor, Linnet. ‘Big Data in the Social Sciences: Towards a New Research Paradigm?’ (being submitted). Meyer, Eric T., Schroeder, Ralph, & Taylor, Linnet (2013, November). ‘The Boundaries of Big Data.’ Paper presented at SIG-SI Symposium, ASIST 2013, November 1-6, 2013, Montreal, Quebec, Canada. Schroeder, Ralph and Cowls, Josh. ‘Answering Questions and Questioning Answers in the Era of Big Data.’ In preparation. Taylor, Linnet, Meyer, Eric T., & Schroeder, Ralph. ‘Bigger and better, or more of the same? Emerging practices and perspectives on big data analysis in economics’. Forthcoming in Big Data & Society. Cowls, Josh. ‘The Crowd in the Cloud?’, forthcoming presentation at IPP 2014. Cowls, Josh. ‘Big Data and Policy Implementation’, in preparation. Schroeder, Ralph. ‘Big Data and Policy Implications’, in preparation.