SEEK for Science: A Data and Model Management Platform to support Open and Reproducible Science in Systems Biology
Published in: Software
  • Presented at BOSC 2014 (Boston, MA), talk video online at http://video.open-bio.org/video/29/seek-for-science-a-data-management-platform-whic or https://www.youtube.com/watch?v=g8b98kJwT60&feature=youtu.be

Transcript of "SEEK for Science: A Data and Model Management Platform to support Open and Reproducible Science in Systems Biology"

  1. SEEK for Science: A Data Management Platform to support Open and Reproducible Science
     Professor Carole Goble, The University of Manchester, UK
     BOSC 2014, 12th July 2014
  2. Systems Biology [cycle diagram with Experimental and Modelling sides]
     Diagram labels: Hypothesis Generation; Experiment and Data Generation; Experiment Analysis; Model Construction; Model Validation; Model Analysis; Biological insight (twice); Public Data Acquisition (four times)
  3. Sponsors and Motivation
     • BMBF “Großprojekt” (large-scale project)
       – ~45 organisations
       – ~70 groups
       – multiscale representation of the liver
       – multiscale data, models
       – imaging data
     • EU ERANet programme
       – 122 organisations
       – 16 multi-inst. consortia
       – independent projects in a two-round funding initiative
  4. Funders
     • Preserve results beyond projects.
     • Organise & link data, models, processes.
     • Exchange & search the initiative's assets.
     • Share & disseminate results.
     • Improve standard curation practice.
     • Pool capacities.
     • Handle home-brewed solutions with mixed resourcing and no access.
  5. People
     • Dynamic distributed groups of experimentalists and modellers
     • Cherished own home-grown and unstable data solutions: wikis, CMS, databases, spreadsheets, files
     • Access & visibility control over shared content
  6. Content
     • Locally hosted private repositories
     • Public archives
     • From single-cell to human
     • Samples, Specimens, Standard Operating Procedures
     • Small Data: Reactome…: files, spreadsheets
     • Big Data: NGS, Mass Spec…: specialist repositories, files
     • Models: ODE, SBML, native Matlab, PDE, multi-scale
     • In progress: versioning, track provenance and parameters
     • Published: citation, links to publications
  7. Cataloguing
     • Find my peers
     • Creating and sharing SOPs across projects
     • Track my specimens
     • Yellow pages; manage SOPs and link them to investigations, studies, assays, specimens and samples
     • Browse experimental data without downloading them
     • How data, models and SOPs fit together
     • Which data belong to which publication
     • Data viewing functionality
     • ISA: link Studies to their data, models, SOPs, samples, publications
     • Track different versions of my model
  8. The Web-based SEEK Platform
     • Ruby on Rails 3.2, BSD
     • https://bitbucket.org/seek4science/seek
     • https://seek.sysmo-db.org/models/114
     • http://www.seek4science.org
  9. Aggregated Asset Infrastructure…
     • Share and interlink multi-stewarded, mixed methods, models, data, samples…
     • A Commons…
     Diagram labels: Data; Models; Articles; External Databases; Metadata
     http://www.seek4science.org
     http://www.isatools.org
  10. Reproducibility Score Card
      • simulate models
      • project mgt, access control
      • reporting, citation
      • governance & policies
      • yellow pages of peers, projects, experts
      • catalogue, link and index data, models, samples, specimens, SOPs, experiments, publications using standards
      • curate & annotate data and models using standards with compliance tools
      • incorporate public data and model repositories & tools; deposition
      • manage, store and exchange different types and scales of data
      • integrate local and project tools and data systems
      • scaled-out collection & analytics using third party platforms
      • differentiate construction, validation & predicted data
  11. [Architecture diagram; labels in slide order]
      • Yellow Pages: Institutions, Projects, People
      • ISA: Investigation, Study, Assay
      • Asset Catalogue: Models, Datafiles, SOPs, Publications, Tags, Versions, Access Privileges, Presentations, Events
      • Datafiles, Models, SOPs
      • JERM: Extract, Harvest, Index
      • APIs and Links: BioModels, ChEBI, BioPortal, PubMed, JWS Online, GEO, SABIO-RK
      • Web Interface, REST API
      • Local SEEK, Wikis, CMS, Own DB, Direct Upload, Project DM, External SEEK, openBIS
  12. • Gateway plugin framework
        – Tight and loose coupling
        – Rails plugin or bundled gem
      • Metadata framework
        – JERM and ISA
      • Different instances
        – Single query across all model repositories
        – One click deposition (BioModels)
      Plug-in, Play nice, Don’t reinvent
  13. Data…
      • Public and new data
      • Factors studied
        – Linked -> SABIO-RK and ChEBI
      • Samples and Specimens
        – Extends EBI/NCBI BioSamples
      • Treatment Extraction
      • Tagging with vocabularies
      • Spreadsheet-based data-view
      • Big Data
        – Upload and by email, share by trusted link, link to external repository
      • Access
        – DOIs and temp links for reviews
  14. Models…
      • Cytoscape
      • Repositories
        – BioModels, JWS Online, local SEEK
      • JWS Online Simulator
        – SBML support
        – Auto generation of SBGN schemas for user models
        – SED-ML export
      • DataFuse
        – Link and compare construction and validation data with models
        – Run models with parameter values from spreadsheets
  15. Standard Formats and Vocabularies
      Diagram labels: Models; Experiment Data; Exchange (four times); Verification; Comparison; Construction; Prediction
      • Just Enough Results Model
      • ISA-TAB
      • SBML
      • MIRIAM
      • SBGN
      • SemanticSBML
      • CellML
      • MIBBI Standards
      • OBO Controlled Vocabularies
      • SED-ML: Simulation Experiment Description Markup Language
  16. Standards, Structure, Interlink
      Diagram labels: Investigations; Studies; Assays; Construction; Validation; Metabolomics (twice); Mass Spec; Transcriptomics; Proteomics; Fluxomics
      Towards Interoperable Bioscience Data, Nature Genetics, 2012
  17. Just Enough Results Model
      Describes and enriches the relationships between things produced and used in experiments.
      http://bioportal.bioontology.org/ontologies/JERM
      Reuse community ontologies, markups, mim, identifiers
  18. Just Enough Results Model
      Describes and enriches the relationships between things produced and used in experiments.
      http://bioportal.bioontology.org/ontologies/JERM
      Reuse community ontologies, markups, mim, identifiers
      • metadata sheets
      • sample sheets
      • data sheets
      • indexes
      http://rightfield.org.uk/
  19. Different types of data
      • Plugins to registered data repositories
      • Extract and auto-catalogue metadata
      • Define relationships, cross-link, aggregate, query
      • Standard-based templates
      • Non-standard templates
      • Open Modelling Exchange Format archive
  20. Sys Bio Research Objects: portable packaged research
      • Research Object Bundle (labels: Adobe UCF, ORE, PROV, ODF)
        – Aggregation
        – Annotations/provenance
        – Ad-hoc domain-specific specification
      • OMEX archive
        – Systems Biology: a common archive format for reuse across tools
      http://www.researchobject.org
  21. Reproducible (Open?) Research
      • Data sharing, openness and careers
      • Incentive
      See Titus and Phil talks
  22. Open Research: Research Groups & Lifecycles
      • Sharing policy
      • Visibility, Downloadability
      • Fine grained permissions
      • Protocols for
        – Management transfer
        – Visibility feedback and sharing workflows
        – Publication data deposition in external public stores
        – Batch publishing
      Diagram labels: Within Project (Versions, Retractions); Across Projects (Versions); Public (Final version, No Retraction); Manager; Owner; Gatekeeper
  23. Open Source Customisable Platform
      https://bitbucket.org/seek4science/seek
      • Vrije Universiteit, Amsterdam
      • Systems Science for Health (SSfH)
      • MACS
      • Yeast Glycolysis
  24. Open Source Customisable Platform
      https://bitbucket.org/seek4science/seek
  25. Open Facility for European Systems Biology data & model management, seeded by EU programmes
      • Platform
        – SEEK + openBIS + new features & styling
      • Resource
        – EuroSEEK + pool of community resources (including established SEEKs)
        – Independent researchers. Secure data.
      • Facility
        – Curation & support services, training
      http://fair-dom.org/
  26. Open Facility for European Systems Biology data & model management, seeded by EU programmes
      • Community
        – workshops, user and developer forums, knowledge network, standards & policy, training, FAIRDOM Foundation, Model Carpentry
      • Sys Bio Developers Foundry workshop, 6-7 October, Heidelberg: http://fair-dom.org/wiki/Foundry_workshop
      • RI
        – working with other EU RIs, an EU network of national facilities, funding models
      http://fair-dom.org/
  27. Carole Goble, Stuart Owen, Jacky Snoep, Wolfgang Mueller, Olga Krebs, Quyen Nguyen, Natalie Stanford, Katy Wolstencroft, Peter Kunszt, Bernd Rinn
      Also contributing: VLN SEEK team
      Also contributing: UK SEEK team