Knowledge Infrastructure for
     Global Systems Science
                     David De Roure
models, narratives, data, scientists, citizens
Social Objects
www.myexperiment.org
• A repository for sharing
  research methods (e.g.
  data analysis pipelines)
• Largest public workflow
  repository (2600 workflows
  for multiple systems, 320
  groups, 280 packs)
• Workflows have co-evolved
  into Packs, Research Objects
  and now Computational
  Research Objects
• Influenced BioCatalogue,
  MethodBox and SysMO-SEEK
[Figure labels: methods, data]
Co-Evolution of Research Objects
[Diagram labels: Workflows; Packs; OAI ORE; Research Objects; W3C PROV; Computational Research Objects]
Sean Bechhofer

SELECT ?pack ?contrib
WHERE {
  ?pack rdf:type mepack:Pack .
  ?pack ore:aggregates ?contrib .
}

SELECT ?wf ?uri
WHERE {
  ?wf mebase:has-current-version ?v .
  ?v mecomp:executes-dataflow ?d .
  ?d mecomp:has-component ?c .
  ?c rdf:type mecomp:WSDLProcessor .
  ?c mecomp:processor-uri ?uri .
}
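The first query above asks for every resource aggregated by a Pack. As a rough illustration of what such a basic graph pattern does, the sketch below evaluates the same two triple patterns over a tiny in-memory list of triples in plain Python. The triples (ex:pack1, ex:workflow1 and so on) and the helper names match_pattern and evaluate are invented for illustration; this is not how the myExperiment SPARQL endpoint is implemented.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Hypothetical data: the pack and resource identifiers are invented for this sketch.
TRIPLES: List[Triple] = [
    ("ex:pack1", "rdf:type", "mepack:Pack"),
    ("ex:pack1", "ore:aggregates", "ex:workflow1"),
    ("ex:pack1", "ore:aggregates", "ex:dataset1"),
    ("ex:pack2", "rdf:type", "mepack:Pack"),
    ("ex:pack2", "ore:aggregates", "ex:workflow2"),
    ("ex:workflow1", "rdf:type", "ex:Workflow"),
]


def match_pattern(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Try to unify one triple pattern with one triple, extending the binding."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable: bind it, or check it agrees with an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            # A constant term that does not match the triple.
            return None
    return result


def evaluate(patterns: Iterable[Triple], triples: List[Triple]) -> List[Binding]:
    """Evaluate a conjunction of triple patterns (a basic graph pattern)."""
    bindings: List[Binding] = [{}]
    for pattern in patterns:
        next_bindings: List[Binding] = []
        for binding in bindings:
            for triple in triples:
                matched = match_pattern(pattern, triple, binding)
                if matched is not None:
                    next_bindings.append(matched)
        bindings = next_bindings
    return bindings


# Mirrors: SELECT ?pack ?contrib WHERE { ?pack rdf:type mepack:Pack . ?pack ore:aggregates ?contrib . }
PACK_QUERY = [
    ("?pack", "rdf:type", "mepack:Pack"),
    ("?pack", "ore:aggregates", "?contrib"),
]

rows = sorted((b["?pack"], b["?contrib"]) for b in evaluate(PACK_QUERY, TRIPLES))
for pack, contrib in rows:
    print(pack, contrib)
```

The second query follows the same idea with five patterns chained through ?v, ?d and ?c; a real deployment would send the SPARQL text to an endpoint rather than match triples by hand.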
http://force11.org/
…after the inputs have been completed, the
model will run, and as the resulting output
becomes available to view in the eBook, the
navigation tree changes to reflect this…




                                   http://www.bristol.ac.uk/cmm/research/estat/
www.methodbox.org
The R dimensions
Reusable. The key tenet of Research Objects is to support the sharing and reuse of data, methods and processes.
Repurposeable. Reuse may also involve the reuse of constituent parts of the Research Object.
Repeatable. There should be sufficient information in a Research Object to be able to repeat the study, perhaps years later.
Reproducible. A third party can start with the same inputs and methods and see if a prior result can be confirmed.
Replayable. Studies might involve single investigations that happen in milliseconds or protracted processes that take years.
Referenceable. If research objects are to augment or replace traditional publication methods, then they must be referenceable or citeable.
Revealable. Third parties must be able to audit the steps performed in the research in order to be convinced of the validity of results.
Respectful. Explicit representations of the provenance, lineage and flow of intellectual property.
“Replacing the Paper: The Twelve Rs of the e-Research Record” on http://blogs.nature.com/eresearch/
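The Reproducible dimension above amounts to re-running a method on the same inputs and checking the outcome against the recorded result. A minimal sketch of that check follows, assuming a hypothetical record format (a dict holding the inputs and a SHA-256 digest of the result) and two toy methods; none of these names come from the Research Object model itself.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List

Method = Callable[[List[float]], Any]


def digest(value: Any) -> str:
    """Stable SHA-256 digest of a JSON-serialisable value."""
    payload = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def record_run(method: Method, inputs: List[float]) -> Dict[str, Any]:
    """Run the method once and keep the inputs plus a digest of the result."""
    return {"inputs": list(inputs), "result_digest": digest(method(inputs))}


def confirms(method: Method, record: Dict[str, Any]) -> bool:
    """Re-run the method on the recorded inputs and compare result digests."""
    return digest(method(record["inputs"])) == record["result_digest"]


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def median(values: List[float]) -> float:
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical study: the original author records a run of `mean`.
record = record_run(mean, [1.0, 2.0, 6.0])

same_method_confirms = confirms(mean, record)       # third party uses the same method
other_method_confirms = confirms(median, record)    # a different method on the same inputs
print(same_method_confirms, other_method_confirms)
```

Comparing digests only works when results are deterministic and serialise identically; numerical methods would more plausibly need a tolerance-based comparison instead.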
Research Record
[Three diagrams contrasting repeat and REPRODUCE: paper and Machine; paper, software and Machine; paper, workflow (wf), software and Machine (REPRODUCE OR REPEAT?)]
blogs.nature.com/eresearch/
Computational Research Objects
• Notifications and automatic re-runs
• Executable theses
• Autonomic Curation and repair
• New research?
Machines are users too
A Big Picture
[Diagram: vertical axis "More machines", horizontal axis "More people". Labels: Conventional Computation; Big Data and Big Compute (cyber/e-infrastructure); Social Networking and online research; The Future?]
Some Social Machines




sociam.org
Discussion points

1. What are the social objects of Global
   Systems Science?
  –   Models, data, narratives, …
2. How do we achieve automation that is
   assistive and scales?
  –   Machines are users too: computer assistance versus
      automation/Taylorisation
  –   Computational research objects
3. Social Machines for Systems Science
  –   Theory and practice, design and construction
  –   Science on, of and in the Web
Data ∪ Models ∪ Expertise
[Diagram: Datasets (+ models) (searched by experts) or “sense-making network”]
Iain Buchan
david.deroure@oerc.ox.ac.uk
www.oerc.ox.ac.uk/people/dder
www.scilogs.com/eresearch
@dder
http://www.myexperiment.org/packs/346
Links
        •   myExperiment project wiki
            http://wiki.myexperiment.org/
        •   Workflow Forever project (Wf4Ever)
            http://www.wf4ever-project.org/
        •   Digital Social Research
            http://www.digitalsocialresearch.net/
        •   e-Stat
            http://www.bristol.ac.uk/cmm/research/estat/
        •   Methodbox
            http://www.methodbox.org/
        •   Theory and Practice of Social Machines (SOCIAM)
            http://sociam.org/
        •   Web Science
            http://webscience.org/
        •   Future of Research Communication (FORCE11)
            http://force11.org/
•   D. De Roure, C. Goble and R. Stevens. The Design and Realisation of the myExperiment
    Virtual Research Environment for Social Sharing of Workflows. Future Generation
    Computer Systems 25, pp. 561-567.
•   S. Bechhofer, I. Buchan, D. De Roure et al. Why linked data is not enough for scientists.
    Future Generation Computer Systems.
•   D. De Roure and C. Goble. Anchors in Shifting Sand: the Primacy of Method in
    the Web of Data. WebSci10, April 26-27th, 2010, Raleigh, NC, US.
•   D. De Roure, S. Bechhofer, C. Goble and D. Newman. Scientific Social Objects. 1st
    International Workshop on Social Object Networks (SocialObjects 2011).
•   D. De Roure, K. Belhajjame, P. Missier et al. Towards the preservation of scientific
    workflows. 8th International Conference on Preservation of Digital Objects (iPRES 2011).
•   Carole A. Goble, David De Roure and Sean Bechhofer. Accelerating scientists’
    knowledge turns. Will be available at www.springerlink.com
•   Khalid Belhajjame, Oscar Corcho, Daniel Garijo et al. Workflow-Centric Research
    Objects: First Class Citizens in Scholarly Discourse. SePublica2012 at ESWC2012,
    Greece, May 2012.
•   Kevin R. Page, Ben Fields, David De Roure et al. Reuse, Remix, Repeat: The Workflows of
    MIR. 13th International Society for Music Information Retrieval Conference (ISMIR
    2012), Porto, Portugal, October 8th-12th, 2012.
