The Internet, Science, and Transformations of Knowledge
            Ralph Schroeder & Eric T. Meyer
      Oxford Internet Institute, University of Oxford




                          2012



                                                        @etmeyer
The OeSS Project                        2005-2012




              Oxford e-Social Science Project




              Oxford Internet Institute
              Oxford e-Research Centre
              Institute for Science, Innovation and Society at Saïd Business School

http://www.oii.ox.ac.uk/microsites/oess/
   research using
   digital tools and data
   for the distributed and collaborative
   production of knowledge
Research computing
Supercomputing


         The Grid


                    Web 2.0


                              Clouds


                                       Big Data
Digital transformations of research

    Computational Manipulability + Research Technologies (Mathematization)
    Socio-Technical Organization (Computerization movements)
    Transformations of Research Front (for different fields)
Computational Manipulability?
• ‘the distinctiveness of the network of mathematical
  practitioners is that they focus their attention on the pure,
  contentless form of human communicative operations: on
  the gestures of marking items as equivalent and of ordering
  them in series, and on the higher-order operations which
  reflexively investigate the combinations of such operations’

• ‘mathematical rapid-discovery science…the lineage of
  techniques for manipulating formal symbols representing
  classes of communicative operations’
Research Technologies and Driving Forces
• Off-the-shelf and special purpose, but ‘all-purpose’
  (passport-like) machines across contexts
• A hard core around which researchers can focus
  attention on a common research front
• Movements (SIMs, Frickel and Gross) to
  computerize (mathematize?) research (Kling)
• Core (research technologies) plus organization
  and movements - driving science (and research)
The sociology of advancing (online) knowledge production
• Research instruments plus mathematics ->
  high-consensus rapid-discovery science
• Orientation to a community of researchers at the
  research front
• Focus of attention limited by law of small
  numbers (Collins)
• The extension of computation into research
• The limits of understanding and explaining
  research-in-the-making…
  …versus a movement that applies across research
Varieties of Research
• Humanities: patterns in words, numbers, images,
  sounds…
• Social Sciences: statistics, image analysis, mapping…
• Sciences: Hacking’s ‘styles’
• Mathematization, now Cloudified
• All knowledge is digitally manipulable in e-Research…
• …but relation of the object to the (physical) world or
  to the research front varies
“   I get pretty much everything I need by
    way of primary sources now from the
    web. For primary sources, I’ve now got
    more material than I will need probably
    for the rest of my lifetime.”
Asking new questions?


“   My greatest frustration in life is that we
    can now answer all the questions we had
    in 1980 faster, much, much faster. And we
    can get around to publishing them much,
    much more quickly. But what we haven’t
    yet done is develop the new questions and
    the new paradigms that should be
    possible, and that we as imaginative
    scholars should be able to imagine.”
Particle Physics and EGEE: The world’s largest e-Science collaboration




Source: CERN, CERN-EX-0712023, http://cdsweb.cern.ch/record/1203203
Citizen e-Science: Distribute computation
Citizen e-Science: Distribute brainpower




NASA Clickworkers (ca. 2000)
GAIN: Genetic Association Information Network
Years       Type of study                  Samples   DNA Sequencing                       Scope of collaboration
1985-1997   Family association / linkage       300   Hundreds of loci / candidate genes   4 sites in USA
1997-2007   Family association / linkage     1,500   10,000 SNPs                          13 sites in USA
2007-2009   Genome-wide association          5,000   1,200,000 SNPs                       Multiple multi-institution collaborations in USA
2010-?      Whole genome                    30,000   Millions of SNPs                     World-wide collaboration
Future      Whole genome sequencing              ?   Entire genome sequence               World-wide collaboration
SPLASH: Structure of Populations, Levels of Abundance, and Status of Humpbacks




Meyer, E.T. (2009). Moving from small science to big science: Social and organizational impediments to large
scale data sharing. In Jankowski, N. (Ed.), E-Research: Transformation in Scholarly Practice (Routledge
Advances in Research Methods series). New York: Routledge.
Humpback whales




e-Research in Sweden
• Sweden has a major e-Research initiative
• ‘Universal’ personal identification
• Uniquely powerful datasets (e.g. twin registry)
• Significance: If Swedes can’t do it, no one can?
• Use of population data in a ‘transparent’ society with high trust between
  people, authorities and researchers…
• …but, implementation of secure distributed access and ‘incidents’
  creating public concerns


• Swedish National Data Service
Swiss BioGrid
Novartis
Weisenburger vs. the Wiki on Pynchon




Comparison of book and wiki annotation efforts

Annotation                                                 Size (no. of words)   Entries (topical + alphabetical + page-by-page)   Contributors
Book form annotation: Weisenburger’s Gravity’s Rainbow     162000                904                                               1 (22)
Wiki: Against the Day                                      455057                120 + 1358 + 4067                                 235
Source: Schroeder, R., & Besten, M. D. (2008). Literary Sleuths Online: e-Research collaboration on the Pynchon Wiki.
Information, Communication & Society, 11(2), 167 - 187.
Fig. 1 Culturomic analyses study millions of books at once.




         J Michel et al. Science 2011;331:176-182



Published by AAAS
Source: Moretti, F. (2011). Network Theory, Plot Analysis. New Left Review 68, p. 81
Browsing and Searching: Humanities
                                                                   79% Google

                                                                   66% Google Scholar

    Libraries                                                      59% Visit the library
                                                                   55% Browse library materials online
                                                                   62% Search library materials online
                                                                   83% Citation chaining



    Journals                                                       48% Browse printed journals
                                                                   76% Browse online journals

    Peers
                                                                   95% Consult peers and experts
Report available at http://www.rin.ac.uk/humanities-case-studies
Physical Sciences (n=76)
Google                                              83%
Browsing or reading online journals                 78%
Peers or experts                                    78%
Searching databases (e.g. Web of Science, arXiv)    72%
Citation chaining                                   72%
Browsing databases (e.g. Web of Science, arXiv)     63%
Students                                            39%
Notification services                               37%
Google Scholar                                      36%
Email lists                                         36%
Browsing library materials online                   33%
Browsing or reading print journals                  29%
Keyword searches of journals                        29%
Wikis                                               26%
Web 2.0 services                                    25%
Keyword searches of library materials               16%
Browsing library materials in person                14%
RSS Feeds                                           12%
Social network sites                                 7%
Report available at: http://ssrn.com/abstract=1991753
Important Information Resources: Google (n=76)
Particle Physics    100%
Gamma Ray Burst      71%
Nuclear Physics      87%
Chemistry            90%
Earth Science        73%
Nanoscience         100%
Zooniverse           63%
Google or Google Scholar as 1st or 2nd most important strategy (n=76)

Field              Google   Google Scholar   Either Google or Google Scholar
Nanoscience           30%              60%                               80%
Earth Science         36%              27%                               55%
Particle Physics      50%               0%                               50%
Nuclear Physics       40%               7%                               47%
Gamma Ray Burst       21%               0%                               21%
Chemistry             20%               0%                               20%
Zooniverse             0%               0%                                0%
Digital as a dirty word



“   I do feel pressure to work more with originals
    than with the digital images, but for the most
    part I do feel like I get more out of using these
    images on my computer. But there’s a certain
    pressure that that’s not what top scholars do
    because that’s not what top scholars did
    25 years ago.”
What difference does it make?
– A physical core network of digital tools and data
  (computational manipulability)
– A research community focuses its efforts
– The expandable (‘clouds’) capacity of research
  instruments + new organizational modes
  = ongoing diffusion of e-Research across domains
– Limits of this spread = limits of attention on new
  fronts towards which there are orientations:
  ‘advances’ versus existing directions
Research Technologies and Driving Forces

 • Off-the-shelf and special purpose, but ‘all-purpose’
   (passport-like) machines across contexts
 • A hard core around which researchers can focus attention
   on a common research front
 • Movements (SIMs, Frickel and Gross) to computerize
   (mathematize?) research (Kling)
 • Core (research technologies) plus organization and
   movements - driving science (and research)
The sociology of advancing (online) knowledge production
• Research instruments plus mathematics -> high-
  consensus rapid-discovery science
• Orientation to a community of researchers at the
  research front
• Focus of attention limited by law of small
  numbers (Collins)
• The extension of computation into research
• The limits of understanding and explaining
  research-in-the-making…
  …versus a movement that applies across research
Oxford Internet Institute

           Ralph Schroeder                             Eric T. Meyer
      ralph.schroeder@oii.ox.ac.uk                eric.meyer@oii.ox.ac.uk
http://www.oii.ox.ac.uk/people/?id=120    http://www.oii.ox.ac.uk/people/?id=120
           With support from:
