Big Data

GY610
Mapping, GIS and Critical Spatial Data
Week 10: Session 10:
Thursday 2.00 – 4.00 pm
Geography Library; Physical Geography Lab
National University of Ireland Maynooth

Published in: Data & Analytics
Big Data

  1. Big Data
     Tracey P. Lauriault, @TraceyLauriault, Tracey.Lauriault@NUIM.ie
     http://www.maynoothuniversity.ie/progcity/
     NIRSA, Maynooth University
     GY610 Mapping, GIS and Critical Spatial Data
     Week 10: Session 10: Thursday 2.00 – 4.00 pm
     Geography Library; Physical Geography Lab
  2. Plan
     1. 4 readings
     2. Brainstorm and discuss commonalities and outliers
     3. Brainstorm & discuss each paper – definitions, concepts, ideas, conclusions, concerns, dislikes, new ideas...
     4. Look at some maps & discuss
     5. Do a big data assessment exercise based on Kitchin’s big data definition
     6. Introduction to the Programmable City Project
  3. Readings:
     • Mark Graham and Taylor Shelton, 2013, Geography and the future of big data, big data and the future of geography, Dialogues in Human Geography 3:255, available at http://dhg.sagepub.com/content/3/3/255 (5 pages)
     • Rob Kitchin and Tracey P. Lauriault, 2014, Small data in the era of big data, GeoJournal, available at http://link.springer.com/article/10.1007%2Fs10708-014-9601-7 (12 pages)
     • Harvey J. Miller and Michael F. Goodchild, 2014, Data-driven geography, GeoJournal, available at http://link.springer.com/article/10.1007%2Fs10708-014-9602-6 (12 pages)
     • Emma Uprichard, Roger Burrows and Simon Parker, 2009, Geodemographic code and the production of space, Environment and Planning A, Vol. 41:2823-2835, available at http://www.envplan.com/abstract.cgi?id=a41116 (11 pages)
  4. Commonalities
  5. • Black boxed algorithms
     • Predictive governance / predictive categories / pre-crime / technological agency / data dictatorships / anticipatory governance / post-hegemonic power – algorithmic!
     • Digital ghettoization or balkanization / data rich areas / samples / sorting
     • Control & power & humans matter
  6. 3 of the 4 papers mentioned these documents
     • http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory (2008): “There's no reason to cling to our old ways. It's time to ask: What can science learn from Google?”
     • http://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf (2001)
  7. All 4 papers include one or the other of these
     • http://dhg.sagepub.com/content/3/3/262.abstract (2013)
     • http://mitpress.mit.edu/books/codespace (2011)
  8. Paper 1: Graham & Shelton, 2013, Geography and the future of big data, big data and the future of geography
  9. Paper 1: Graham & Shelton, 2013, Geography and the future of big data, big data and the future of geography
     • Big Data Characteristics
       • Volume
       • Velocity
       • Variety
       • Transactional?
       • Effects they engender?
       • Computational paradigm
       • Meme – establishment of truth
     • Big Data View of Authors
       • Discourses, objects, practices
       • Views of the world
       • Measuring, models, algorithms, info systems...
       • Scientistic, positivistic and quantitative turn
       • Data as facts, validity and objective truth
       • End of theory?
     • Actors
       • Technologists
       • Journalists
       • Venture capitalists
       • Private sector
       • Geographers?
     • Concepts
       • Data shadows
       • Data and algorithmic governance
       • Computational approaches
       • Augmented space
       • Behavioural profiles
       • Privacy
       • Metadata
       • Predictive categories
       • Triangulation
       • Neutrality of databases and algorithms
       • Black box algorithms
       • Obfuscation and refraction
       • Amplified socio-spatial unevenness
       • Data as depoliticizing tool
       • Digital ghettoization or balkanization
       • Openness, trust, transparency
  10. Graham & Shelton, 2013
      • Conclusion
        • Exposed the promises and perils of big data
        • Demonstrated the discursive power of big data as a meme
        • Opportunity to use big data for social justice, inequality, and relationship with the environment
        • But unevenness of representation, limited opportunities for participation, barriers to research, opaqueness, governance issues and privacy are a concern
        • Who is big data serving?
  11. Paper 2: Kitchin & Lauriault, 2014, Small data in the era of big data
  12. Paper 2: Kitchin & Lauriault, 2014, Small data in the era of big data
      • Growth
        • Development of tech, infrastructure, techniques, & processes,
        • embedded into everyday business, social practices & spaces,
        • embedded into mobile devices, objects, machines, and systems that are networked,
        • social media, online interactions, transactions, data analytics
      • Objects
        • Traffic systems & web cams
        • BIMS
        • Surveillance & policing systems, biometrics
        • Gov. databases
        • Customer, production & logistic chains
        • Data enabled & data producing infrastructures
        • Finance & payment systems
        • Locative & social media
        • Algorithmically controlled cameras, sensors, scanners,
        • smart phones,
        • clickstreams,
        • by-product of networked systems
        • Derived data products
      • Infrastructure
        • Catalogues, portals, directories and repositories, archives
        • Cyberinfrastructure – SDI
        • Standards, protocols and policies
        • Assemblage
      • Concepts
        • Small data
        • Data rich areas
        • Big data analytics
        • Ontological characteristics
        • Data brokers
        • Dataveillance
        • Social sorting
        • Control creep
        • Anticipatory governance
        • Augmented
        • Monitored
        • Regulated
        • Assemblage
        • Socio-technical systems
        • Volunteered or crowdsourced
        • Oligoptic view of the world vs god's-eye view
        • Openly expressed data – swipe cards, sensors
        • Exhaust – by-products
        • Ecological fallacies
        • Gamed data
        • Curated image of the self
        • Streams of data, garden hose, spritzers, white list
        • Data storage vs archiving
        • Data brokers
        • Abductive, deductive, inductive
        • Geodemographic segmentation
        • Black boxed algorithms
        • Data determinism
  13. Kitchin & Lauriault, 2014
      • Issues
        • Big data become more important than small data
        • “Small data mine gold from working on a narrow seam, whereas big data studies seek to extract nuggets through open pit mining”
        • Data quality, fidelity, lineage, objective, authenticity, reliability – big data are so large that these no longer matter
        • Inexactitude
        • Open vs closed
        • Replication & validation
        • Combining big data with small data
        • Data free from theory
        • Lack of hypothesis
        • Data driven science
        • Weak surface analysis vs deep penetrating insight
        • Stigmatization and redlining
        • Informed consent
      • Big data are shaped by:
        • Field of view / sampling, location of devices, settings / parameters, users
        • Technology / platform used – produce variance and bias
        • Context within which generated
        • Data ontology
        • Regulatory environment
      • They capture what is easy to ensnare
      • Data Analytics
        • Struggle with social & context
        • Create bigger haystacks
        • Do not address big issues well
        • Favours memes over masterpieces
        • Obscures values
  14. Kitchin & Lauriault
      • Conclusions
        • Small data will continue to be vital, big and small data will be complementary, small data are the baseline
        • Data infrastructures store and disseminate small data
        • Scaling, linking, joining, combining big and small data
        • Small data are exposed to epistemologies of data science (e.g., digital humanities)
        • Small data combined with big data are influencing the growth of data brokers and profiling
        • Pernicious effects of combining: dataveillance, social sorting, control creep and anticipatory governance impinge on privacy and social freedom, and have structural consequences for people's lives
  15. Comparing Small & Big Data
      | Characteristics | Small Data | Big Data | Attributes of Big Data |
      | Volume | Limited to large | Very large | Terabytes and petabytes |
      | Exhaustivity | Samples | Entire population | In scope striving toward entire population and systems, n=all |
      | Resolution & indexicality | Coarse & weak to tight & strong | Tight & strong | As detailed as possible and uniquely indexical in identification |
      | Relationality | Weak to strong | Strong | Common fields to enable co-joining of datasets |
      | Velocity | Slow, freeze-framed | Fast | Real & near-real time |
      | Variety | Limited to wide | Wide | Diverse in type, structured and unstructured, maybe temporally and spatially referenced |
      | Flexible & Scalable | Low to middling | High | Can easily add to and extend, can expand in size |
      Table compiled by Kitchin from: Boyd & Crawford 2012, Dodge & Kitchin 2005, Marz & Warren 2012, Mayer-Schonberger & Cukier 2013
  16. Paper 3: Miller & Goodchild, 2014, Data-driven geography
  17. Paper 3: Miller & Goodchild, 2014, Data-driven geography
      • Big Data Characteristics
        • Volume
        • Velocity
        • Veracity
      • Data capturing technologies
        • Sensors ground based
        • Software
        • Location aware tech
        • GPS
        • Mobile phones
        • Surveillance cameras
        • In situ sensors – cars, phones, in infrastructure
        • Remote sensors – airborne and satellite platforms
        • Radiofrequency
        • RFID
        • Georeferenced social media & crowdsourcing
      • Def:
        • Predictions are made by mining data for patterns with correlation among new data sources and some accurate predictions
      • 4 paradigms in science
        1. Empirical science
        2. Theoretical science
        3. Computational science
        4. Data driven science – big data
      • Tensions
        1. Theory driven vs data driven
        2. Prediction vs discovery
        3. Law seeking vs description seeking
        4. Evolution vs revolution
        5. From question to sample – from sample to question
      • Issues:
        1. Population not samples
        2. Messy not clean
        3. Correlation not causation
      • 4 capabilities of abductive reasoning
        1. Ability to posit fragments of theory
        2. Massive set of knowledge, common sense to domain expertise
        3. Means to search to find connections and patterns and potential explanation
        4. Complex problem solving – analogy, approximation and guessing
        5. Background knowledge and interesting measures, formalized knowledge
  18. Miller & Goodchild, 2014
      • Big questions
        • Are theory and explanation archaic?
        • Does data velocity matter?
        • Can lack of QC & rigorous sampling be overcome?
        • Can we make valid generalizations from serendipitous data collection?
        • Can big data data-driven methods lead to significant discoveries?
        • Or will we continue to rely on scarce data (small data)?
      • Sections
        1. Theory in data driven geography
           • Correlation supersedes causation, explanation but not laws, mid-range theories, general propositions, long-term big space vs short-term small space, nomothetic vs idiographic
        2. Approaches to data driven geography
           • Knowledge discovery, data exploration and hypothesis generating, abductive, deductive and inductive reasoning
           • Data-driven modelling – general to specific vs specific to general, predictive performance
           • Theory may not be possible, data drive the form of the model, complexity, de-skilling
        3. Caution with data driven
           • Formalizing geographic knowledge, spuriousness, truth and understanding, black boxed algorithms, privacy, pre-crime, pre-punishment, data-driven dictatorship
      • Benefits
        • Spatial temporal dynamics vs snapshots @ multiple scales
        • Mundane & unplanned phenomena captured
        • Probable and inconsequential
        • Improbable but consequential
  19. Miller & Goodchild
      • Conclusion
        • Most fundamental changes are variety and velocity in data
        • Old issues in new clothes – volume, n, messy data, idiographic vs nomothetic knowledge
        • Big data can inform both geographic knowledge discovery and spatial modelling – but need to formalize geographic knowledge to clean data and ignore spurious patterns, and to build true and understandable models
        • Blackbox of closed systems
        • Caution on social implications – predictive governance, avoid data dictatorships, and humans need to be part of the decision making process
  20. Paper 4: Uprichard, Burrows & Parker, 2009, Geodemographic code and the production of space
  21. Paper 4: Uprichard, Burrows & Parker, 2009, Geodemographic code and the production of space
      • Geodemographic classifications:
        • The spaces people occupy say something about the sort of people that live there
        • Classes are sets of practices
        • Inscriptions
        • Embedded in social action and power
        • Socially produced
        • Have some social meaning about the subjects, esp. name, useful
        • Combines national censuses and other data, admin & commercial
        • Data used are already pre-classed – contingent, historical, political and cognitive
        • Use of statistical knowledge
        • Credibility
      • Tools:
        • PRIZM
        • Acorn
        • Mosaic
      • Concepts
        • Social spatial vectors / forms
        • Code/space
        • Geodemographics as code
        • Coded space
        • Technological agency
        • Algorithmic power
        • Technological unconscious
        • Automatic production of space
        • Software sorted geographies
        • Ground truth
        • Urban ecology – socio spatial structure
        • Ecological determinants
        • Clusters, types of spaces, sorting
        • Complexity, contingency, contrivance & desirability
        • Making hold and being held
        • Coded classifications
        • Mechanics of method
        • Production of reality/space
        • Ontological properties of the world
        • Self-organizing, fractal
        • Dynamic interaction
        • Post-hegemonic power – algorithmic!
        • Translation and transduction of space
  22. Uprichard, Burrows & Parker, 2009
      • Big Questions:
        • How code is instantiated, materialised and constructed via code/space
        • Reiterative, transformative or recursive practices of technology
        • How is the code that constructs coded spaces constructed?
        • Problematize the contingency in producing spaces on coded classifications
        • Who is constructing the code for whom?
        • Material outcomes of code
      • Issues
        • Making coded space
        • Which one becomes useful?
        • Who decides what is and is not useful?
        • Political and ethical concerns
        • Social shaping
        • Entrenchment of categories – normalization
        • Intrinsic or natural kinds?
        • Circularity of measurement
  23. Uprichard, Burrows & Parker
      • Conclusion
        • If posthegemonic power is algorithmic, and if algorithms are fundamental to the transduction of space, then we need to rethink the analysis of the production of space so that the cultural, social, political and technical construction of code becomes a fundamental part of that process
  24. URLs
      • http://www.ethanzuckerman.com/blog/2008/12/26/mapping-infrastructure-and-flow/
      • http://atlas.gcrc.carleton.ca/homelessness/intro/intro.xml.html
      • http://sikuatlas.ca/cape_dorset_terminology.html
      • http://www.floatingsheep.org/
      • http://maps.stamen.com/#terrain/12/37.7706/-122.3782
      • http://www.dublindashboard.ie/pages/index
  25. Exercise
      | Characteristics | Small Data | Big Data | Census | Sensors | Remote Sensing | Social Media | Other |
      | Volume | Limited to large | Very large | Very large | • | • |
      | Exhaustivity | Samples | Entire pop. | all | • | Crucial | • |
      | Resolution & indexicality | Coarse & weak - tight & strong | Tight & strong | Individual ID | • | ? |
      | Relationality | Weak to strong | Strong | Name, address | • | ? |
      | Velocity | Slow, freeze-framed | Fast | Decennial, quinquennial | X | Crucial | • |
      | Variety | Limited to wide | Wide | Questions | X | One stream | X |
      | Flexible & Scalable | Low to middling | High | Hard to change, fields fixed time | X | ? |
  26. The Programmable City
      • Funded by the European Research Council (ERC) and Science Foundation Ireland (SFI)
      • SH3: Environment and Society
      • Led by Dr Rob Kitchin, the Principal Investigator
      • Based at the National Institute for Regional and Spatial Analysis (NIRSA)
      • At the National University of Ireland Maynooth (NUIM)
  27. MIT Press 2011; Sage 2014
      The aim of the ERC project is to build on and extend a decade of work that culminated in the Code/Space book (MIT Press) with a set of detailed empirical studies
  28. Objective
      • to provide:
        • “an interdisciplinary analysis of the two core inter-related aspects of the emerging programmable city:
          • (a) Translation: how cities are translated into code, and
          • (b) Transduction: how code reshapes city life” (Kitchin 2011).
  29. Objectives
      • How is the city translated into software and data?
      • How do software and data reshape the city?
      Diagram: The City and Code & Data, linked by
      • Translation: City into Code & Data (Discourses, Practices, Knowledge, Models)
      • Transduction: Code & Data Reshape City (Mediation, Augmentation, Facilitation, Regulation)
  30. ProgCity Research Matrix
      | | Translation: City into code | Transduction: Code reshapes city |
      | Understanding the city (Knowledge) | How are digital data materially and discursively supported and processed about cities and their citizens? | How does software drive public policy development and implementation? |
      | Managing the city (Governance) | How are discourses and practices of city governance translated into code? | How is software used to regulate and govern city life? |
      | Working in the city (Production) | How is the geography and political economy of software production organised? | How does software alter the form and nature of work? |
      | Living in the city (Social Politics) | How is software discursively produced and legitimated by vested interests? | How does software transform the spatiality and spatial behaviour of individuals? |
  31. Kitchin’s Data Assemblage
      | Attributes | Elements |
      | Systems of thought | Modes of thinking, philosophies, theories, models, ideologies, rationalities, etc. |
      | Forms of knowledge | Research texts, manuals, magazines, websites, experience, word of mouth, chat forums, etc. |
      | Finance | Business models, investment, venture capital, grants, philanthropy, profit, etc. |
      | Political economy | Policy, tax regimes, public and political opinion, ethical considerations, etc. |
      | Governmentalities / Legalities | Data standards, file formats, system requirements, protocols, regulations, laws, licensing, intellectual property regimes, etc. |
      | Materialities & infrastructures | Paper/pens, computers, digital devices, sensors, scanners, databases, networks, servers, etc. |
      | Practices | Techniques, ways of doing, learned behaviours, scientific conventions, etc. |
      | Organisations & institutions | Archives, corporations, consultants, manufacturers, retailers, government agencies, universities, conferences, clubs and societies, committees and boards, communities of practice, etc. |
      | Subjectivities & communities | Of data producers, curators, managers, analysts, scientists, politicians, users, citizens, etc. |
      | Places | Labs, offices, field sites, data centres, server farms, business parks, etc., and their agglomerations |
      | Marketplace | For data, its derivatives (e.g., text, tables, graphs, maps), analysts, analytic software, interpretations, etc. |
  32. Locations
      • Dublin (Primary City)
      • Boston (Secondary City)
      • Ottawa/Montreal (Open Data Case Studies)
  33. The Dublin Dashboard includes:
      • real-time information
      • time-series indicator data
      • & interactive maps about all aspects of the city
      Benefits:
      • detailed, up to date intelligence about the city that aids everyday decision making and fosters evidence-informed analysis.
      Freely available data sources:
      • Dublin City Council
      • Dublinked
      • Central Statistics Office
      • Eurostat
      • government departments
      • links to a variety of existing applications
      Produced by:
      • The Programmable City project
      • All-Island Research Observatory (AIRO) at Maynooth University
      • working with Dublin City Council
      Funded by:
      • the European Research Council (ERC)
      • Science Foundation Ireland (SFI)