1. Scientific Social Objects
David De Roure, Sean Bechhofer, Carole Goble, David Newman
sean.bechhofer@manchester.ac.uk
@seanbechhofer
http://humblyreport.wordpress.com
1st International Workshop on Social Object Networks (SocialObjects 2011),
Boston, October 9th 2011.
2. E. Science laboris
• Workflows are the new rock and roll!
• Machinery for coordinating the
execution of services and linking
together resources
• Scientist friendly (for some class
of scientists)
• Repetitive and mundane boring stuff
made easier
• Enable automation
• Make science repeatable (and
sometimes reproducible)
• Encourage best practices
• Shareable
Carole Goble
3. Reuse, Recycle, Repurpose
• Paul writes workflows for identifying biological
pathways implicated in resistance to
Trypanosomiasis in cattle
• Paul meets Jo. Jo is investigating Whipworm in
mouse.
• Jo reuses one of Paul’s workflows without change.
• Jo identifies the biological pathways involved in
sex dependence in the mouse model, believed to
be involved in the ability of mice to expel the
parasite.
• Previously, a manual two-year study by Jo had
failed to do this.
Carole Goble
4. Carole Goble e-Science is me-Science:
What do Scientists want? EGEE, 2006
“There are these great
collaboration tools that
12-year-olds are using. It’s
all back to front.”
Robert Stevens
5. A sharing platform for scientists
Distinctive features supporting
credit and attribution
A repository of research
methods
Open source (BSD) Ruby on Rails
app
A community social network of
people and things
REST and SPARQL interfaces,
supports Linked Data
A Social Virtual Research Environment
Part of product family including BioCatalogue, MethodBox and SysmoDB
A probe into researcher behaviour
~4700 members, 270 groups, ~2000 workflows, ~200 packs
12. myExperiment For Developers
[Architecture diagram. Interfaces: HTML, XML, facebook, iGoogle, android. Access layer: API config, Managed REST API, SPARQL endpoint. Entities: tags, ratings, reviews, profiles, workflows, credits, groups, packs, friendships, files. Components: Search Engine, RDF Store, mySQL, Enactor.]
13. SPARQL endpoint
[Diagram. The myExperiment data in mySQL (tags, ratings, reviews, profiles, workflows, credits, groups, files, packs, friendships) is transformed into an RDF Store, exposed through the SPARQL endpoint at rdf.myexperiment.org. Data model: the modularised myExperiment Ontology (evolving!), using DC, FOAF and SIOC (Semantically-Interlinked Online Communities).]
14. SPARQL endpoint
It is effectively a generic API: the user specifies exactly what
information they want to send and what they expect back, rather
than relying on query/access mechanisms provided via specific API
functions. In some ways it has the versatility of querying the
myExperiment database directly, but with the significant benefit of
a common data model that is independent of the codebase. Through
use of OWL and RDF it is immediately interoperable with available
tooling.
Exposing data in this way is an example of the "cooperate, don't
control" principle of Web 2.0.
Use of existing vocabularies (FOAF, SIOC, etc.) allows for
mashup/integration with other sources.
Packs can also link to external resources. Links out and links in
to the “Linked Data Cloud”.
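To make the "generic API" idea concrete, here is a minimal sketch of how a client might phrase its own question as a SPARQL query and encode it for the endpoint. It only builds the request URL and does not contact any server. Assumptions: the `/sparql` path on rdf.myexperiment.org is a guess (the slides give only the host), and whether the store describes its resources with exactly these Dublin Core and FOAF properties is also assumed for illustration. The `query` parameter name comes from the standard SPARQL Protocol.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: hypothetical endpoint path; the slides only name the host.
ENDPOINT = "http://rdf.myexperiment.org/sparql"

# The client decides what it wants back (resource, title, creator name)
# instead of calling a purpose-built API function.
# Assumption: the store uses dcterms:title / dcterms:creator / foaf:name.
QUERY = """\
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?resource ?title ?creatorName
WHERE {
  ?resource dcterms:title ?title ;
            dcterms:creator ?creator .
  ?creator foaf:name ?creatorName .
}
LIMIT 10
"""


def build_request_url(endpoint: str, query: str) -> str:
    """Encode a SPARQL query as a GET URL using the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = build_request_url(ENDPOINT, QUERY)
```

Because the vocabularies are shared ones (FOAF, DC), the same query shape could in principle be pointed at other Linked Data sources, which is where the mashup potential comes from.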
22. Research Objects: Beyond the Pack
• Argumentation: Convince the reader of the
validity of a position [Mesirov]
– Reproducible Results System: facilitates enactment
and publication of reproducible research.
J. Mesirov, Accessible Reproducible Research, Science 327(5964), pp. 415-416, 2010
http://dx.doi.org/10.1126/science.1179653
• Results are reinforced by reproducibility [De Roure]
– Explicit representation of method.
D. De Roure and C. Goble, Anchors in Shifting Sand: the
Primacy of Method in the Web of Data, Web Science Conference 2010, Raleigh
NC, 2010 http://eprints.ecs.soton.ac.uk/20817/
• Verifiability as a key factor in scientific discovery.
Stodden et al., Reproducible Research: Addressing the Need for Data and
Code Sharing in Computational Science, Computing in Science and Engineering
12(5), pp. 8-13, 2010 http://dx.doi.org/10.1109/MCSE.2010.113
25. Research Objects
• Aggregations intended to foster Reuse, Repurposing and
Repeatability of investigations
• A generalisation of the pack
• Rich social interactions
– Shareable
– Citeable
– Credit and Attribution
– Provenance
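One way to picture a Research Object as "a generalisation of the pack" is as an aggregation of resources that carries its own identifier, credits and provenance. The sketch below is purely illustrative: the class names, fields and example.org URIs are hypothetical and are not the myExperiment or Wf4Ever data model.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    """One aggregated item, e.g. a workflow, a data file or an external link."""
    uri: str
    kind: str                        # e.g. "workflow", "file", "external"
    creators: tuple[str, ...] = ()   # who gets credit for this item


@dataclass
class ResearchObject:
    """Hypothetical aggregation with a citeable URI and simple provenance."""
    uri: str
    title: str
    resources: list[Resource] = field(default_factory=list)
    derived_from: list[str] = field(default_factory=list)  # URIs of source objects

    def aggregate(self, resource: Resource) -> None:
        """Add a resource once; repeated additions are ignored."""
        if resource not in self.resources:
            self.resources.append(resource)

    def credits(self) -> list[str]:
        """Creators of all aggregated resources, first-seen order, no duplicates."""
        seen: dict[str, None] = {}
        for resource in self.resources:
            for creator in resource.creators:
                seen.setdefault(creator, None)
        return list(seen)

    def repurpose(self, new_uri: str, new_title: str) -> ResearchObject:
        """Copy the aggregation under a new identity and record its origin."""
        return ResearchObject(
            new_uri, new_title, list(self.resources), self.derived_from + [self.uri]
        )


# Usage, loosely following the Paul and Jo story from the earlier slide.
paul_ro = ResearchObject("http://example.org/ro/1", "Trypanosomiasis pathways")
paul_ro.aggregate(Resource("http://example.org/workflow/1", "workflow", ("Paul",)))
jo_ro = paul_ro.repurpose("http://example.org/ro/2", "Whipworm in mouse")
jo_ro.aggregate(Resource("http://example.org/file/1", "file", ("Jo",)))
```

Keeping credit on the individual resources and provenance on the aggregation means a repurposed object still attributes the original author while recording what it was derived from.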
26. Wf4Ever
…technological infrastructure for the preservation and
efficient retrieval and reuse of scientific workflows in a range
of disciplines.
• Architecture/implementation for workflow preservation,
sharing and reuse
• Research Object models
• Workflow Decay, Integrity and Authenticity
• Workflow Evolution and Recommendation
• Provenance
• Driven by Use Cases
FP7 Digital Libraries and Digital Preservation
iSOCO, University of Manchester, Universidad Politécnica de
Madrid, University of Oxford, Poznan Supercomputing and
Networking Centre, Instituto de Astrofísica de Andalucía,
Leiden University Medical Centre
27. Astronomers' Questions
When accessing a workflow
• Can I use it for my purposes (in my words)?
• If I can expect it to run, when it was last run, by whom?
• What it does quickly, by one of
– example input / output (and trying it)
– a description
– ‘reading’ its key parts
– what it was used for
– related workflows its creator
– contacting the creator or last user
• How it relates to other things
• How I need to cite the author and workflow?
When sharing a workflow
• What rights others have?
• What a good workflow is to get a good score?
– Make my workflow findable, reusable, and ready for review
– Instructions to authors
– Two types of contributions: serious science, preliminary/playing around
• If my workflow may have issues
– What the system or other users think it does
• Share freely or anonymously upon request?
http://www.flickr.com/photos/-bast-/349497988/
28. Other Work
Jia Zhang, Wei Tan, John Alexander, Ian Foster, Ravi Madduri, Recommend-As-You-Go:
A Novel Approach Supporting Services-Oriented Scientific Workflow Reuse,
Proceedings of the 2011 IEEE International Conference on Services
Computing, 2011 http://dx.doi.org/10.1109/SCC.2011.120
Wei Tan, Jia Zhang, Ian Foster, Network Analysis of Scientific Workflows: A
Gateway to Reuse, IEEE Computer 43(9), pp. 54-61
http://dx.doi.org/10.1109/MC.2010.262
• Volume of data is not enough, but additional consideration
of content could help us.
• Approach also being considered for recommendation in
Wf4Ever.
Julia Stoyanovich, Ben Taskar, and Susan Davidson, Exploring Repositories of
Scientific Workflows, WANDS 2010 http://wands2010.doc.ic.ac.uk/
• Organising workflows into categories. Now available as
“Topics” tab.
29. Wrap up
• myExperiment as a platform for sharing
– Contributables, Annotations, Users
– APIs, RDF/SPARQL endpoints
– Come and play!
• Workflows (and their constituent parts) as social objects
• Various networks layered on those objects
• Compositional nature of the objects
– Workflows combining services
– Packs combining objects
• Research Object as a future vision for composite Scientific
Social Objects
30. Thanks!
• myExperiment Team
– http://www.myexperiment.org/
• Wf4Ever Team
– http://www.wf4ever-project.org/
• Manchester Information Management Group
– http://img.cs.manchester.ac.uk