Culturomic analyses study millions of books at once. (A) Top row: Authors have been writing for millennia; ~129 million book editions have been published since the advent of the printing press (upper left). Second row: Libraries and publishing houses provide books to Google for scanning (middle left). Over 15 million books have been digitized. Third row: Each book is associated with metadata. Five million books are chosen for computational analysis (bottom left). Bottom row: A culturomic time line shows the frequency of “apple” in English books over time (1800–2000). (B) Usage frequency of “slavery”. The Civil War (1861–1865) and the civil rights movement (1955–1968) are highlighted in red. The number in the upper left (1e-4 = 10^-4) is the unit of frequency. (C) Usage frequency over time for “the Great War” (blue), “World War I” (green), and “World War II” (red).
Point out dis-intermediation / re-intermediation aspects of online distribution / dominance by Google
The Internet, Science, and Transformations of Knowledge (Ralph Schroeder)
The Internet, Science, and Transformations of Knowledge. Ralph Schroeder, Oxford Internet Institute, University of Oxford. May 3, 2012
Overview
• Definition of e-Research
• The sociology of advancing (online) knowledge
• Examples and Cases
• Implications
Research computing: Supercomputing, The Grid, Web 2.0, Clouds, Big Data
e-Research: Defined as distributed and collaborative digital tools and data for knowledge production
Digital transformations of research: Computational Manipulability + Research Technologies (Mathematization); Research Front (for different fields); Socio-Technical Organization (Computerization movements)
A Model of Transformations: Computational manipulability + Research technologies + Socio-technical organization = Transformations of research front
Computational Manipulability?
• ‘the distinctiveness of the network of mathematical practitioners is that they focus their attention on the pure, contentless form of human communicative operations: on the gestures of marking items as equivalent and of ordering them in series, and on the higher-order operations which reflexively investigate the combinations of such operations’
• ‘mathematical rapid-discovery science…the lineage of techniques for manipulating formal symbols representing classes of communicative operations’
Research Technologies and Driving Forces
• Off-the-shelf and special purpose, but ‘all-purpose’ (passport-like) machines across contexts
• A hard core around which researchers can focus attention on a common research front
• Movements (SIMs, Frickel and Gross) to computerize (mathematize?) research (Kling)
• Core (research technologies) plus organization and movements - driving science (and research)
The sociology of advancing (online) knowledge production
• Research instruments plus mathematics -> high-consensus rapid-discovery science
• Orientation to a community of researchers at the research front
• Focus of attention limited by law of small numbers (Collins)
• The extension of computation into research
• The limits of understanding and explaining research-in-the-making… …versus a movement that applies across research
Varieties of Research
• Humanities: patterns in words, numbers, images, sounds…
• Social Sciences: statistics, image analysis, mapping…
• Sciences: Hacking’s ‘styles’
• Mathematization, now Cloudified
• All knowledge is digitally manipulable in e-Research…
• …but relation of the object to the (physical) world or to the research front varies
Examples and Cases
– GAIN = statistical data pooling
– Galaxyzoo = taxonomic crowdsourcing
– Integrative Biology = modelling
– EGEE/LHC = observation and measurement
– SPLASH = taxonomic
– Swedish National Data Service = statistical, combined data
– SwissBioGrid = statistical/modelling
– VOSON = statistical, network analysis
– PynchonWiki = interpretive crowdsourcing
– Cultural genomics with Google Books = statistical/interpretive
– Moretti = distant reading via network analysis
...what type of transformation?
Particle Physics and EGEE: The world’s largest e-Science collaboration
Source: CERN, CERN-EX-0712023, http://cdsweb.cern.ch/record/1203203
SPLASH: Structure of Populations, Levels of Abundance, and Status of Humpbacks
Source: Meyer, E.T. (2009). Moving from small science to big science: Social and organizational impediments to large-scale data sharing. In Jankowski, N. (Ed.), E-Research: Transformation in Scholarly Practice (Routledge Advances in Research Methods series). New York: Routledge.
e-Research in Sweden
• Sweden has a major e-Research initiative
• ‘Universal’ personal identification
• Uniquely powerful datasets (e.g. twin registry)
• Significance: If Swedes can’t do it, no one can?
• Use of population data in a ‘transparent’ society with high trust between people, authorities and researchers…
• …but, implementation of secure distributed access and ‘incidents’ creating public concerns
• Swedish National Data Service
VOSON (NodeXL version)
Source: Ackland, R. (2010), "WWW Hyperlink Networks," Chapter 12 in D. Hansen, B. Shneiderman and M. Smith (eds), Analyzing Social Media Networks with NodeXL: Insights from a connected world. Morgan Kaufmann.
Fig. 1 Culturomic analyses study millions of books at once. J Michel et al. Science 2011;331:176-182. Published by AAAS
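The usage-frequency time lines in Fig. 1 can be read as per-year relative frequencies: occurrences of a term in a given year divided by the total number of words in that year. The following is a minimal sketch of that calculation, assuming a naive whitespace tokenizer and a made-up toy corpus; `usage_frequency` and `toy_corpus` are hypothetical names for illustration and are not part of any Google Books tooling.

```python
from collections import Counter


def usage_frequency(docs_by_year, term):
    """Return {year: occurrences of term / total tokens in that year}."""
    target = term.lower()
    freq = {}
    for year, texts in sorted(docs_by_year.items()):
        # Naive tokenization (assumption): lowercase and split on whitespace.
        tokens = [tok for text in texts for tok in text.lower().split()]
        counts = Counter(tokens)
        freq[year] = counts[target] / len(tokens) if tokens else 0.0
    return freq


# Toy, invented corpus for illustration only (not real data).
toy_corpus = {
    1800: ["the apple fell", "an apple a day"],
    1900: ["the war ended", "no fruit today here"],
}

timeline = usage_frequency(toy_corpus, "apple")
for year, value in timeline.items():
    print(year, value)
```

Normalizing by the yearly total matters because far more text is published in later years, so raw counts alone would not be comparable across time.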
Source: Moretti, F. (2011). Network Theory, Plot Analysis. New Left Review 68, p. 81
Source: Meyer, E.T., Schroeder, R. (2009). Untangling the Web of e-Research: Towards a Sociology of Online Knowledge. Journal of Informetrics 3(3):246-260
iTunes U, Google Citations, Microsoft Academic Search, Twitter, YouTube …
Source: Meyer & Schroeder (2009). The World Wide Web of Research and Access to Knowledge. Journal of Knowledge Management Research and Practice 7 (3):218-233.
What difference does it make?
– A physical core network of digital tools and data (computational manipulability)
– A research community focuses its efforts
– The expandable (‘clouds’) capacity of research instruments + new organizational modes = ongoing diffusion of e-Research across domains
– Limits of this spread = limits of attention on new fronts towards which there are orientations: ‘advances’ versus existing directions
Changing Research Practices
• Communication: searchability/findability, and (pressure for) increased reflexivity
• Role of Knowledge in society: boundaries vis-à-vis the public and between research communities become more porous
• Knowledge: driven towards computational manipulability and aggregatability
• The confluence of these three: Research becomes an increasingly autonomized apparatus in society and a complexified socio-technical one
Implications
• Implications for Science Communication:
  – Reflexivity changes practices, and the role of knowledge vis-à-vis public
• Implications for STS, information science and other fields:
  – synthesis beyond existing (sub)disciplinary boundaries is needed
• Implications for policy and practice:
  – awareness of positive and negative aspects of autonomization (or intermediation and disintermediation of knowledge)
  – changing boundaries within knowledge, and between knowledge and society
Oxford e-Social Science Project: Oxford Internet Institute; Oxford e-Research Centre; Institute for Science, Innovation and Society at Saïd Business School
http://www.oii.ox.ac.uk/microsites/oess/