Big data, new epistemologies and paradigm shifts
or
Do revolutions in measurement lead to
revolutions in science?
Rob Kitchin,
National University of Ireland Maynooth
Introduction
• “Revolutions in science have often been preceded by revolutions
in measurement” Sinan Aral (2010)
• “Big data creates a radical shift in how we think about research.
... [It offers] a profound change at the levels of epistemology
and ethics. Big data reframes key questions about the
constitution of knowledge, the processes of research, how we
should engage with information, and the nature and the
categorization of reality ... Big data stakes out new terrains of
objects, methods of knowing, and definitions of social life”
(boyd and Crawford 2012)
• Critically examine
• Big data
• Data analytics
• Effects on epistemological and methodological approaches in the sciences,
social sciences and humanities
Small data / big data
Characteristic                Small data                        Big data
Volume                        Limited to large                  Very large
Exhaustivity                  Samples                           Entire populations
Resolution and indexicality   Coarse & weak to tight & strong   Tight & strong
Relationality                 Weak to strong                    Strong
Velocity                      Slow, freeze-framed               Fast
Variety                       Limited to wide                   Wide
Flexible and scalable         Low to middling                   High
Urban big data
• Directed
o Surveillance: CCTV,
drones/satellite
o Scaled public admin records
• Automated
o Automated surveillance
o Digital devices
o Sensors, actuators,
transponders, meters (IoT)
o Interactions and transactions
• Volunteered
o Social media
o Sousveillance/wearables
o Crowdsourcing
o Citizen science
Big data analytics
• Challenge of making sense of big data is coping with its
abundance and exhaustivity, timeliness and dynamism,
messiness and uncertainty, semi-structured or unstructured
nature
• Solution has been machine learning (AI) made possible by
advances in computation and computational techniques
• Four broad classes of analytics:
• data mining and pattern recognition
• statistical analysis
• prediction, simulation, and optimization
• data visualization and visual analytics
New paradigms
• Big data, coupled with new data analytics, challenges established
epistemologies across the sciences, social sciences and humanities
• Transforming how we frame, ask and answer questions
• Some argue leading to new paradigms within and across disciplines
• For Kuhn (1962) paradigm shifts are driven by science being unable to account
for particular phenomena or answer key questions
• For Gray (2009) paradigm shifts are driven by new forms of measurement, data
and analytical techniques. He charts the evolution of science through four
broad paradigms
Paradigm   Nature                  Form                                                      When
First      Experimental science    Empiricism; describing natural phenomena                  pre-Renaissance
Second     Theoretical science     Modelling and generalization                              pre-computers
Third      Computational science   Simulation of complex phenomena                           pre-big data
Fourth     Exploratory science     Data-intensive; statistical exploration and data mining   Now
Science
• Gray proposes that science is entering a fourth paradigm
driven by big data and new data analytics
• Leading to new era of data-intensive science and a
radically new extension of the established scientific
method
• Others suggest that big data ushers in a new era of
empiricism, wherein data can speak for themselves free of
theory
• The latter has gained credence outside of the academy,
especially within business circles, but its ideas have also
taken root in data science
‘The end of theory’
• Anderson (2008) argues: ‘The data deluge makes the scientific method
obsolete’; that the patterns and relationships contained within big data
inherently produce meaningful and insightful knowledge
• “There is now a better way. Petabytes allow us to say: ‘Correlation is
enough.’ ... We can analyze the data without hypotheses about what it
might show. We can throw the numbers into the biggest computing
clusters the world has ever seen and let statistical algorithms find
patterns where science cannot. ... Correlation supersedes causation,
and science can advance even without coherent models, unified
theories, or really any mechanistic explanation at all. There’s no
reason to cling to our old ways.”
• Ayasdi software claims to be able to:
• “automatically discover insights -- regardless of complexity -- without
asking questions.”
‘The end of theory’
• Moreover, analysts can employ an ensemble approach
• Literally hundreds of different algorithms can be applied to
a dataset to determine the best answer or a composite
model or explanation
• A radically different approach to that traditionally used
wherein the analyst selects an appropriate method based
on their knowledge of techniques and the data
• The logic is that insight is born from the data, not theory
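The ensemble idea on this slide can be illustrated with a minimal sketch. It assumes a toy one-variable dataset and three deliberately simple, hypothetical candidate models (a mean predictor, a least-squares line and a nearest-neighbour lookup); real ensemble tools apply far more algorithms. Each candidate is fitted, scored on held-out data, and both the best single model and a simple averaged composite are reported.

```python
import random
import statistics

def fit_mean(xs, ys):
    # Baseline: always predict the mean of the training outcomes.
    m = statistics.fmean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    # Ordinary least squares with a single predictor.
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def fit_nearest(xs, ys):
    # Predict the outcome of the closest training point.
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    # Mean squared error of a model on a dataset.
    return statistics.fmean((model(x) - y) ** 2 for x, y in zip(xs, ys))

def run_ensemble(train, test):
    fitters = {"mean": fit_mean, "linear": fit_linear, "nearest": fit_nearest}
    models = {name: fit(*train) for name, fit in fitters.items()}
    scores = {name: mse(model, *test) for name, model in models.items()}
    best = min(scores, key=scores.get)  # best single model on held-out data
    # Composite model: the plain average of all candidate predictions.
    composite = lambda x: statistics.fmean(m(x) for m in models.values())
    scores["composite"] = mse(composite, *test)
    return best, scores

# Toy data (an assumption for illustration): y = 2x + 1 plus noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]
train, test = (xs[:150], ys[:150]), (xs[150:], ys[150:])

best, scores = run_ensemble(train, test)
print(best, scores)
```

Even in this small sketch, the pool of candidate algorithms, the error measure and the train/test split are all choices made by the analyst, so the "best answer" is not produced by the data alone.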
‘The end of theory’
• Powerful and attractive set of ideas at work in the empiricist epistemology that
run counter to mainstream deductive approach:
• big data can capture the whole of a domain and provide full resolution
• there is no need for a priori theory, models or hypotheses
• through the application of agnostic data analytics the data can speak for
themselves free of human bias or framing
• that any patterns and relationships within big data are inherently
meaningful and truthful
• meaning transcends context or domain-specific knowledge, thus can be
interpreted by anyone who can decode a statistic or data visualization
• offers the possibility of insightful, objective and profitable knowledge
without science or scientists
• These work together to suggest that a new mode of understanding the world is
being created, one in which the modus operandi is purely inductive in nature
‘The end of theory’
• Empiricist thinking is problematic for four
reasons:
• Big data are both a representation and a sample, shaped
by the technology and platform used, the data ontology
employed, the regulatory environment, and are subject
to sampling bias
• Big data do not arise from nowhere, free from ‘the
regulating force of philosophy’
• Big data cannot simply speak for themselves free of
human bias or framing
• Big data cannot be interpreted outside of context and
domain-specific knowledge
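The first of these reasons, that big data remain a sample subject to sampling bias, can be shown with a small simulation. Everything in it is a labelled assumption: a hypothetical population of adults, a hypothetical platform whose take-up falls with age, and a small random survey for comparison. It is a sketch of the general point, not a model of any real platform.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: 100,000 adults with ages spread evenly over 18-90.
population = [rng.randint(18, 90) for _ in range(100_000)]
true_mean = statistics.fmean(population)

def joins_platform(age):
    # Assumed participation rule: younger people are more likely to be
    # captured by the platform (probability falls linearly with age).
    return rng.random() < (95 - age) / 100

# "Big" dataset: everyone the platform happens to capture.
platform_sample = [age for age in population if joins_platform(age)]

# "Small" dataset: a simple random sample of 500 people.
survey_sample = rng.sample(population, 500)

platform_error = abs(statistics.fmean(platform_sample) - true_mean)
survey_error = abs(statistics.fmean(survey_sample) - true_mean)

print(f"true mean age:     {true_mean:.1f}")
print(f"platform (n={len(platform_sample)}): off by {platform_error:.1f} years")
print(f"survey   (n={len(survey_sample)}):   off by {survey_error:.1f} years")
```

Under these assumptions the platform dataset is many times larger than the survey yet gives the worse estimate of mean age, because volume does not correct for who is and is not captured.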
Data-driven science
• Data-driven science seeks to hold to the tenets of the scientific
method, but is more open to using a hybrid combination of
abductive, inductive and deductive approaches
• Differs from traditional, experimental deductive design in that it
seeks to generate hypotheses and insights ‘born from the data’
rather than ‘born from the theory’
• Seeks to incorporate a mode of induction into the research
design, though explanation through induction is not the intended
end-point.
• Instead, induction forms a new mode of hypothesis generation
before a deductive approach is employed
• Process of induction does not arise from nowhere, but is situated
and contextualised within a highly evolved theoretical domain
Data-driven science
• The epistemological strategy is to use guided knowledge discovery
techniques to identify potential questions worthy of further
examination and testing
• And instead of testing whether every relationship revealed has
veracity, attention is focused on those that seemingly offer the
most likely or valid way forward based on established science
• Approach is suited to extracting additional, valuable insights that
traditional ‘knowledge-driven science’ would fail to generate
• The data-driven approach:
• is suited to exploring, extracting value from and making sense of massive,
interconnected data sets
• fosters interdisciplinary research that conjoins domain expertise
• will lead to more holistic and extensive models and theories of
entire complex systems rather than elements of them
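The two-stage logic of data-driven science described on these slides, inductive hypothesis generation followed by deductive testing, can be sketched as follows. The data are synthetic and the variable names (`v0` to `v29`) are hypothetical. The shortlist here is chosen by correlation strength alone, which is a stand-in: the slides describe that selection as guided by established science and domain knowledge.

```python
import random

def pearson(xs, ys):
    # Pearson correlation coefficient of two equal-length sequences.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def permutation_p_value(xs, ys, rng, rounds=500):
    # Share of random shufflings whose correlation is at least as strong
    # as the observed one (a simple permutation test).
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(shuffled)
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (rounds + 1)

rng = random.Random(7)
n = 400
# Synthetic data (assumption): 30 candidate variables, of which only
# "v3" actually influences the outcome.
variables = {f"v{i}": [rng.gauss(0, 1) for _ in range(n)] for i in range(30)}
outcome = [0.5 * x + rng.gauss(0, 1) for x in variables["v3"]]

half = n // 2

# Step 1 (inductive): explore the first half of the data and shortlist
# the relationships that look most promising.
ranked = sorted(
    variables,
    key=lambda name: abs(pearson(variables[name][:half], outcome[:half])),
    reverse=True,
)
candidates = ranked[:3]

# Step 2 (deductive): treat each shortlisted relationship as a hypothesis
# and test it on the held-out second half.
results = {
    name: permutation_p_value(variables[name][half:], outcome[half:], rng)
    for name in candidates
}
print(results)
```

Only the shortlisted relationships are tested, rather than every relationship the data reveal, and the test uses data that played no part in generating the hypothesis.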
Social sciences and humanities
• The effect of big data/data analytics in the humanities and
social sciences is less certain
• These areas of scholarship are highly diverse in their
philosophical underpinnings, with only some scholars
employing the epistemology common in the sciences
• Whilst there is a history of quantitative and positivistic
scholarship in the social sciences, it is much rarer in the humanities
• There has been a strong post-positivistic shift in many
social science disciplines
Computational social science
• For positivistic scholars in the social sciences, big data offers the
opportunity to develop more sophisticated, wider-scale, finer-
grained models of human life. To shift from:
• data-scarce to data-rich studies of societies
• from static snapshots to dynamic unfoldings
• from coarse aggregations to high resolutions
• from relatively simple models to more complex, sophisticated
simulations
• The potential is for studies that have much greater breadth, depth,
scale, and timeliness, and that are inherently longitudinal
• The variety, exhaustivity, resolution, and relationality of data,
plus new techniques, address some of the critiques of
positivistic scholarship (reductionism and universalism) by
providing more finely grained, sensitive, and nuanced analysis
Social sciences
• For post-positivist scholars, big data offers both opportunities and challenges
• Opportunities:
• a proliferation, digitisation and interlinking of a diverse set of analogue and
unstructured data, much of it new (e.g., social media) and many of which have
been difficult to access (e.g., millions of books, documents, newspapers,
photographs, art works, material objects, etc.)
• And new tools of data curation, management and analysis that can handle
massive numbers of data objects
• Challenges:
• Analysis is mechanistic, atomizing, and parochial, reducing diverse individuals and
complex multidimensional social structures to mere data points; it identifies
trends but not what produces them
• struggles with the social and with context
• creates bigger haystacks
• identifies but does not address problems
• tends to marginalize metaphysical and normative questions
• erosion of domain level expertise
• promotion of empiricist/quantitative approaches and skewing of funding
towards big data
• skills and knowledge deficit
Digital humanities
• Opportunities/challenges are being keenly felt in the
humanities; rise of digital humanities
• Rather than providing a close reading of a handful of
novels or photographs, or a couple of artists and their
work, it becomes possible to search, connect and find
patterns across a very large number of related works
• Digital humanities advocates broadly divided into two
camps epistemologically
• Those that believe that new techniques (counting,
graphing, mapping, data mining) bring methodological rigour
and objectivity to disciplines that have heretofore been
unsystematic and random in their focus and approach
• Those that see the techniques as a supplement to, rather than a
replacement for, existing humanities methods and theory
building
Digital humanities
• Both cases tend to use descriptive rather than inferential
statistics
• The claims of the former have opened up an
epistemological debate centred on close versus distant
reading/interpretation, ability of algorithms to parse
meaning and context
• DH seen by some as mechanistic and reductionist
(reduces literature and art to data)
• Identifies patterns but not processes or meaning
• Sacrifices complexity, specificity, context, depth and
critique for scale, breadth, automation, descriptive
patterns and the impression that interpretation does not
require deep contextual knowledge
• Other concerns are similar to those raised in the social sciences
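The kind of counting that distant reading relies on can be sketched in a few lines. The three-text corpus below is invented for illustration, and the function names are hypothetical. The output is purely descriptive: it reports how often terms occur and in how many works, with no inferential statistics involved.

```python
import re
from collections import Counter

# Invented miniature corpus (assumption); real projects span thousands of works.
corpus = {
    "novel_a": "The city was quiet. The city slept.",
    "novel_b": "A storm crossed the sea and reached the city.",
    "novel_c": "The sea was calm.",
}

def tokens(text):
    # Lower-case the text and split it into simple word tokens.
    return re.findall(r"[a-z']+", text.lower())

def document_frequency(corpus, terms):
    # Number of works in which each term appears at least once.
    counts = Counter()
    for text in corpus.values():
        present = set(tokens(text))
        counts.update(term for term in terms if term in present)
    return counts

def term_frequency(corpus, term):
    # Raw count of one term in each work.
    return {title: tokens(text).count(term) for title, text in corpus.items()}

df = document_frequency(corpus, ["city", "sea", "storm"])
tf = term_frequency(corpus, "city")
print(df)
print(tf)
```

The counts show where "city" or "sea" occur across the works but say nothing about what the words mean in context, which is the concern about pattern without process or meaning raised on this slide.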
What happens to small data studies?
• Big data doesn’t replace or negate small data
• Small data have a proven track record of answering specific
questions, with established procedures, methods, etc.
• Studies can be much more finely tailored
• Small data studies seek to mine gold from carefully working a
narrow seam, whereas big data studies seek to extract nuggets
through open-pit mining, scooping up and sieving huge tracts of
land
• Small data will, however, increasingly be made more big data-like
through the development of new data infrastructures that:
• pool, scale and link small data in order to create larger datasets,
• encourage sharing and re-use
• open them up to combination with big data and analysis using big
data analytics
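What pooling and linking small datasets involves can be sketched as below. The two surveys, the area codes and the administrative lookup are all invented for illustration, and `pool` and `link` are hypothetical helper names rather than functions from any particular data infrastructure.

```python
# Two invented small surveys that share a harmonised schema (assumption).
survey_a = [
    {"area": "D01", "respondent": "a1", "commute_min": 25},
    {"area": "D02", "respondent": "a2", "commute_min": 40},
]
survey_b = [
    {"area": "D01", "respondent": "b1", "commute_min": 30},
    {"area": "D03", "respondent": "b2", "commute_min": 55},
]

# Invented administrative records keyed by area code.
area_records = {"D01": {"population": 12000}, "D02": {"population": 8000}}

def pool(*datasets):
    # Combine several datasets into one, recording where each row came from.
    pooled = []
    for index, dataset in enumerate(datasets):
        for row in dataset:
            pooled.append({**row, "source": index})
    return pooled

def link(rows, lookup, key):
    # Attach matching external records to each row via a shared key,
    # flagging rows for which no match exists.
    linked = []
    for row in rows:
        extra = lookup.get(row[key])  # None when there is no matching record
        linked.append({**row, **(extra or {}), "linked": extra is not None})
    return linked

pooled = pool(survey_a, survey_b)
linked = link(pooled, area_records, key="area")
for row in linked:
    print(row)
```

Keeping the `source` and `linked` fields preserves provenance, so later analysis can still tell which small study a row came from and which rows could not be matched.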
Conclusion
• Big data and analytics constitute a data revolution that fundamentally
alters the nature of data and how we make sense of them
(disruptive innovation)
• It is starting to transform how research is conducted, organised
and managed, enabling new approaches to data
generation/analysis that make it possible to ask and answer
questions in new ways
• They also pose significant social, political and ethical questions
• As new technologies and analytics develop, these transformations
will extend and deepen, raising a series of conceptual and
methodological challenges across the sciences, social sciences and
humanities
• They have the potential to usher in new paradigms, but the more likely
outcome is further pluralism in approaches
Rob.Kitchin@nuim.ie
@robkitchin
Kitchin, R. and McArdle, G. (2016) What makes big data, big data? Exploring the ontological characteristics of 26 datasets. Big Data & Society 3: 1-10.
Kitchin, R. and Lauriault, T. (2014) Towards critical data studies. SSRN.
Kitchin, R. and Lauriault, T. (2015) Small data in the era of big data. GeoJournal 80(4): 463-475.
Kitchin, R. (2014) Big data, new epistemologies & paradigm shifts. Big Data & Society 1: 1-12.
Kitchin, R. (2014) The real-time city? Big data and smart urbanism. GeoJournal 79(1): 1-14.
Kitchin, R. (2013) Big data and human geography: Opportunities, challenges and risks. Dialogues in Human Geography 3(3): 262-267.
http://www.nuim.ie/progcity
@progcity

Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 

Recently uploaded (20)

Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdf
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 

Big data, new epistemologies and paradigm shifts

  • 1. Big data, new epistemologies and paradigm shifts or Do revolutions in measurement lead to revolutions in science? Rob Kitchin, National University of Ireland Maynooth
  • 2. Introduction • “Revolutions in science have often been preceded by revolutions in measurement” Sinan Aral (2010) • “Big data creates a radical shift in how we think about research. ... [It offers] a profound change at the levels of epistemology and ethics. Big data reframes key questions about the constitution of knowledge, the processes of research, how we should engage with information, and the nature and the categorization of reality ... Big data stakes out new terrains of objects, methods of knowing, and definitions of social life” (boyd and Crawford 2012) • Critically examine • Big data • Data analytics • Effects on epistemological and methodological approach in sciences, social sciences and humanities
  • 3. Small data / big data
      Characteristic | Small data | Big data
      Volume | Limited to large | Very large
      Exhaustivity | Samples | Entire populations
      Resolution and indexicality | Coarse & weak to tight & strong | Tight & strong
      Relationality | Weak to strong | Strong
      Velocity | Slow, freeze-framed | Fast
      Variety | Limited to wide | Wide
      Flexible and scalable | Low to middling | High
  • 4. Urban big data • Directed o Surveillance: CCTV, drones/satellite o Scaled public admin records • Automated o Automated surveillance o Digital devices o Sensors, actuators, transponders, meters (IoT) o Interactions and transactions • Volunteered o Social media o Sousveillance/wearables o Crowdsourcing o Citizen science
  • 5. Big data analytics • Challenge of making sense of big data is coping with its abundance and exhaustivity, timeliness and dynamism, messiness and uncertainty, semi-structured or unstructured nature • Solution has been machine learning (AI) made possible by advances in computation and computational techniques • Four broad classes of analytics: • data mining and pattern recognition • statistical analysis • prediction, simulation, and optimization • data visualization and visual analytics
  • 6.
  • 7. New paradigms • Big data, coupled with new data analytics, challenges established epistemologies across the sciences, social sciences and humanities • Transforming how we frame, ask and answer questions • Some argue leading to new paradigms within and across disciplines • For Kuhn (1962) paradigm shifts are driven by science being unable to account for particular phenomena or answer key questions • For Gray (2009) paradigm shifts are driven by new forms of measurement, data and analytical techniques. He charts the evolution of science through four broad paradigms:
      Paradigm | Nature | Form | When
      First | Experimental science | Empiricism; describing natural phenomena | pre-Renaissance
      Second | Theoretical science | Modelling and generalization | pre-computers
      Third | Computational science | Simulation of complex phenomena | pre-big data
      Fourth | Exploratory science | Data-intensive; statistical exploration and data mining | Now
  • 8. Science • Gray proposes that science is entering a fourth paradigm driven by big data and new data analytics • Leading to a new era of data-intensive science and a radically new extension of the established scientific method • Others suggest that big data ushers in a new era of empiricism, wherein data can speak for themselves free of theory • The latter has gained credence outside of the academy, especially within business circles, but its ideas have also taken root in data science
  • 9. ‘The end of theory’ • Anderson (2008) argues: ‘The data deluge makes the scientific method obsolete’; that the patterns and relationships contained within big data inherently produce meaningful and insightful knowledge • “There is now a better way. Petabytes allow us to say: ‘Correlation is enough.’ ... We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot. ... Correlation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all. There’s no reason to cling to our old ways.” • Ayasdi software claims to be able to: • “automatically discover insights -- regardless of complexity -- without asking questions.”
  • 10. ‘The end of theory’ • Moreover, can employ an ensemble approach • Literally hundreds of different algorithms can be applied to a dataset to determine the best answer or a composite model or explanation • A radically different approach to that traditionally used wherein the analyst selects an appropriate method based on their knowledge of techniques and the data • The logic is that insight is born from the data, not theory
  • 11. ‘The end of theory’ • Powerful and attractive set of ideas at work in the empiricist epistemology that run counter to the mainstream deductive approach: • big data can capture the whole of a domain and provide full resolution • there is no need for a priori theory, models or hypotheses • through the application of agnostic data analytics the data can speak for themselves free of human bias or framing • any patterns and relationships within big data are inherently meaningful and truthful • meaning transcends context or domain-specific knowledge, thus can be interpreted by anyone who can decode a statistic or data visualization • offers the possibility of insightful, objective and profitable knowledge without science or scientists • These work together to suggest that a new mode of understanding the world is being created, one in which the modus operandi is purely inductive in nature
  • 12. ‘The end of theory’ • Empiricist thinking is problematic for four reasons: • Big data are both a representation and a sample, shaped by the technology and platform used, the data ontology employed, the regulatory environment, and are subject to sampling bias • Big data do not arise from nowhere, free from ‘the regulating force of philosophy’ • Big data cannot simply speak for themselves free of human bias or framing • Big data cannot be interpreted outside of context and domain-specific knowledge
  • 13. Data-driven science • Data-driven science seeks to hold to the tenets of the scientific method, but is more open to using a hybrid combination of abductive, inductive and deductive approaches • Differs from traditional, experimental deductive design in that it seeks to generate hypotheses and insights ‘born from the data’ rather than ‘born from the theory’ • Seeks to incorporate a mode of induction into the research design, though explanation through induction is not the intended end-point. • Instead, induction forms a new mode of hypothesis generation before a deductive approach is employed • Process of induction does not arise from nowhere, but is situated and contextualised within a highly evolved theoretical domain
  • 14. Data-driven science • The epistemological strategy is to use guided knowledge discovery techniques to identify potential questions worthy of further examination and testing • And instead of testing whether every relationship revealed has veracity, attention is focused on those that seemingly offer the most likely or valid way forward based on established science • Approach is suited to extracting additional, valuable insights that traditional ‘knowledge-driven science’ would fail to generate • Data-driven approaches: • suited to exploring, extracting value and making sense of massive, interconnected data sets • fostering interdisciplinary research that conjoins domain expertise • will lead to more holistic and extensive models and theories of entire complex systems rather than elements of them
  • 15. Social sciences and humanities • The effect of big data/data analytics in the humanities and social sciences is less certain • These areas of scholarship are highly diverse in their philosophical underpinnings, with only some scholars employing the epistemology common in the sciences • Whilst there is a history of quantitative and positivistic scholarship in the social sciences, it is much rarer in the humanities • There has been a strong post-positivistic shift in many social science disciplines
  • 16. Computational social science • For positivistic scholars in the social sciences, big data offers the opportunity to develop more sophisticated, wider-scale, finer-grained models of human life. To shift from: • data-scarce to data-rich studies of societies • static snapshots to dynamic unfoldings • coarse aggregations to high resolutions • relatively simple models to more complex, sophisticated simulations • The potential is for studies with much greater breadth, depth, scale, and timeliness, and that are inherently longitudinal • The variety, exhaustivity, resolution, and relationality of data, plus new techniques, address some of the critiques of positivistic scholarship (reductionism and universalism) by providing more finely grained, sensitive, and nuanced analysis
  • 17. Social sciences • For post-positivist scholars, big data offers both opportunities and challenges • Opportunities: • a proliferation, digitisation and interlinking of a diverse set of analogue and unstructured data, much of it new (e.g., social media) and many of which have been difficult to access (e.g., millions of books, documents, newspapers, photographs, art works, material objects, etc.) • And new tools of data curation, management and analysis that can handle massive numbers of data objects • Challenges: • Analysis mechanistic, atomizing, and parochial, reducing diverse individuals and complex multidimensional social structures to mere data points; identifies trends but not what produces such a trend • struggles with the social and with context • creates bigger haystacks • identifies but does not address problems • tends to marginalize metaphysical and normative questions • erosion of domain level expertise • promotion of empiricist/quantitative approaches and skewing of funding towards big data • skills and knowledge deficit
  • 18. Digital humanities • Opportunities/challenges being keenly felt in the humanities; rise of digital humanities • Rather than providing a close reading of a handful of novels or photographs, or a couple of artists and their work, it becomes possible to search, connect and find patterns across a very large number of related works • Digital humanities advocates broadly divided into two camps epistemologically • Those that believe that new techniques (counting, graphing, mapping, data mining) bring methodological rigour and objectivity to disciplines that have heretofore been unsystematic and random in their focus and approach • Those that see the techniques as a supplement to, rather than replacement for, existing humanities methods and theory building
  • 19. Digital humanities • Both camps tend to use descriptive rather than inferential statistics • The claims of the former have opened up an epistemological debate centred on close versus distant reading/interpretation and the ability of algorithms to parse meaning and context • DH seen by some as mechanistic and reductionist (reduces literature and art to data) • Identifies patterns but not processes or meaning • Sacrifices complexity, specificity, context, depth and critique for scale, breadth, automation, descriptive patterns and the impression that interpretation does not require deep contextual knowledge • Other concerns are similar to those raised in the social sciences
  • 20. What happens to small data studies? • Big data doesn’t replace or negate small data • Small data have a proven track record of answering specific questions, with established procedures, methods, etc. • Studies can be much more finely tailored • Small data studies seek to mine gold from carefully working a narrow seam, whereas big data studies seek to extract nuggets through open-pit mining, scooping up and sieving huge tracts of land • Small data will, however, increasingly be made more big data-like through the development of new data infrastructures that: • pool, scale and link small data in order to create larger datasets • encourage sharing and re-use • open them up to combination with big data and analysis using big data analytics
  • 21. Conclusion • Big data/analytics constitute a data revolution that fundamentally alters the nature of data and how we make sense of them (disruptive innovation) • It is starting to transform how research is conducted, organised and managed, enabling new approaches to data generation/analysis that make it possible to ask and answer questions in new ways • They also pose significant social, political and ethical questions • As new technologies and analytics develop, these transformations will extend and deepen, raising a series of conceptual and methodological challenges across sciences, social sciences and humanities • They have the potential to usher in new paradigms, but the more likely outcome is further pluralism in approaches
  • 22. Rob.Kitchin@nuim.ie @robkitchin Kitchin, R. and McArdle, G. (2016) What makes big data, big data? Exploring the ontological characteristics of 26 datasets. Big Data & Society 3: 1–10 Kitchin, R. and Lauriault, T. (2014) Towards critical data studies. SSRN Kitchin R and Lauriault T (2015) Small data in the era of big data. GeoJournal 80(4): 463-475 Kitchin R (2014) Big data, new epistemologies & paradigm shifts. Big Data and Society 1: 1-12. Kitchin, R. (2014) The real-time city? Big data and smart urbanism. GeoJournal 79(1): 1-14. Kitchin, R. (2013) Big data and human geography: Opportunities, challenges and risks. Dialogues in Human Geography 3(3): 262–267 http://www.nuim.ie/progcity @progcity