Collaboration and sharing
computational research methods
University of Manchester, UK
Methods
Scientific research is generally held to be of good provenance when
it is documented in detail sufficient to allow
In silico Experimental Standard Operating
Procedures, Protocols, Plans
E. Science laboris: in silico experiments
Automated, reusable scripted
analysis pipelines - workflows
Data processing, data chaining
Data and tool integration
Simulation steering and parameter sweeps
Model and hypothesis building
Result Validation and comparison
Data cleaning, curation and preservation
Shield from clouds and
Record provenance: steps,
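A minimal sketch of the idea above, assuming a workflow is just an ordered list of functions: each step's output is chained into the next step, and a provenance record is kept for every step. The names `run_workflow`, `fingerprint`, `clean` and `normalise` are hypothetical illustrations, not part of Taverna, Galaxy or myExperiment.

```python
import hashlib
import json
from datetime import datetime, timezone


def fingerprint(value):
    """Return a short, stable hash of a JSON-serialisable value."""
    encoded = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()[:12]


def run_workflow(steps, data):
    """Chain the steps (each output feeds the next) and log provenance."""
    provenance = []
    for step in steps:
        result = step(data)
        provenance.append({
            "step": step.__name__,
            "input": fingerprint(data),
            "output": fingerprint(result),
            "finished": datetime.now(timezone.utc).isoformat(),
        })
        data = result
    return data, provenance


# Hypothetical analysis steps, for illustration only.
def clean(values):
    """Drop missing measurements."""
    return [v for v in values if v is not None]


def normalise(values):
    """Scale values relative to the largest one."""
    top = max(values)
    return [v / top for v in values]


result, log = run_workflow([clean, normalise], [4, None, 2, 8])
for entry in log:
    print(entry["step"], entry["input"], "->", entry["output"])
```

Because each record stores the hash of its input and output, the output hash of one step matches the input hash of the next, so the chain of steps can be checked after the run.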
Genetic variation in cattle species.
Food security, biodiversity.
Resistance to African trypanosomiasis infection
Liverpool (Kemp), Manchester (Brass), Nairobi
Comparing new data with reference genomes, prior results
and the literature to identify interesting differences
22 million SNPs
Little Science +
Bottom up Effectiveness in Research
• Automated, repeatable, tracked plumbing
– Using institutional and community computing
infrastructures, tools and datasets
• Easier access to best of breed and “surfing” results
– Non-developers get access to sophisticated codes and
applications, shielded from nasty computing details.
• Leverage applications, services, datasets and codes
shielded from computing details.
– Honors original codes and applications. Heterogeneous
coding styles and tool sets. The best applications.
• Extensibility, adaptability & innovation.
– My stuff. Variant design.
Reuse, Recycle, Repurpose, Mash, Trade, Publish
Identify biological pathways implicated in
resistance to Trypanosomiasis in cattle using
mouse as a model organism.
Fisher P, et al. Nucleic Acids Research, 2007, 35(16): 5625-5633
Dr Paul Fisher
Dr Jo Pennock
Identify the biological pathways involved in
sex dependence in the mouse
model, believed to be involved in the ability
of mice to expel the whipworm parasite.
Levison S.E., et al. Inflammatory Bowel Diseases (2010)
Global Long Tail
How do I find and share
methods across
the web?
How do I connect with
other authors and users?
How do I know if it’s any
good or right for me?
Who else is using it?
Where do I comment on
• Socially share, discover, review and reuse
workflows and other scientific methods.
• Cooperative market place.
• A scientific gateway.
• Commons-based Production + Social
• Primary contribution, reviewing and curating.
Find experts and
peers, advice, work
Train and educate
Launch workflows Cloud Methods Commons
Facts and Figures: Boutique but Beautiful
• Public Service: 1325 workflows, 349 files, 138 packs, 4129 registered
members, 235 groups, 56 different countries, ~ 3000 unique hits per
month. Workflows viewed/downloaded many thousands of times.
• Adopted by 19 workflow systems and integrated into workflow
workbenches: Galaxy, Taverna
– Biology, chemistry, image analysis, social science, astronomy, engineering,
– Specialised clones in Music & NeuroScience.
– Focus of research on workflow patterns and analytics.
• JISC funding since 2007
• (Other funding: Microsoft, EU, EPSRC, BBSRC)
Effectiveness and Open Collaboration
Open platform, off the shelf components, open
development, open linked data, Web 3.0 funky
Linked Data Cloud
Backup and Archive Friends, colleagues, resources
[Duncan Hull] Data (files, spreadsheets)
Workflow management system
Collaboration, acceleration and
transparency through Automated
Social collaboration environments
Collaboration, acceleration and
transparency through Human
Adoption of Reproducible Methods
New Publishing and Learning Objects
Pre- and Post-Publication Metadata Differentials
Citation, Credit and Reputation
Methods Matter Science 2010
(or at least
Actionable scholarly Environment
publishing & Undergraduate
Conference Preprints Reports
Web Data, Metadata
Results & Analyses Ontologies
Data and Method
The rise of the
Sharing Governance ….
“It’s not ready yet”
“I need to get (another) publication first”
“We don’t have the resources or skills to prepare
it for others, esp. now we finished that project”
“Others won’t use it properly.” “It’s not worth
“It’s faster/easier to do it myself, and will get
the credit/control too”
“It’s not described enough to be usable”
“I don’t trust the quality. It’s not reliable enough. It’s
Credit and Reputation Community building
T Shirts are not enough
QUALITY for REUSE
Social & Cultural
and curated Content
Computer science research
Computational researchers Methodology
Social Science Social experiment
Free like puppies
Take Home: Methods Matter.
• Workflows are a transformative mechanism of
connecting tools and encoding know-how
– Scientists stand on the shoulders of resource experts
• myExperiment is an example of a collaborative
environment for connecting workflow authors and users
– Authors stand on the shoulders of each other
• The Power of Collectivism.
• Rewards and risks for researchers in competitive research.
• Cultural shift in reward, adoption and support for
building, sharing and curating computational methods.
myExperiment Director: David De Roure
Developers
• Jiten Bhagat
• Don Cruickshank
• Danius Michaelides
• David Newman
• Sergejs Aleksejevs
• Mark Borkum
• Matt Lee
• Tom Foster
Users
• Katy Wolstencroft
• Paul Fisher
• Duncan Hull
• Franck Tanoh
• Andrea Wiggins
• Marco Roos
• Jerzy Orlowski
• Olga Krebs
• Wolfgang Mueller
• Tony Linde
Sponsors
• Savas Parastatidis
• Roger Barga
• Derick Campbell
• Tony Hey
Allied, Contributing Projects
• Thomas Laurent
• Eric Nzuobontane
• Ian Dunlop
• Stuart Owen
• Shoaib Sufi
• Sean Bechhofer
• Rodrigo Lopez
• Steve Pettifer
• Mannie Tags
• Finn Bacall
• Sarah Thew
• Matt Gamble
• Tim Clark
Social Scientists
• Yuiwei Lin
• Rob Proctor
• Meik Poschen
• Jonathan Foster
David De Roure