Why we need PIDs for Structural Biology - EOSC Symposium, Budapest, 2019

Persistent Identifiers in Structural Biology use case presented to European Open Science Cloud PID Policy breakout session.

Published in: Science

  1. 1. Why we need PIDs for Structural Biology • Marcus Povey • EOSC Symposium, Budapest • November 2019
  2. 2. Instruct-ERIC • Instruct-ERIC facilitates access to cutting-edge research infrastructure within the domain of Structural Biology • We have centres all around Europe • 13 member countries, with 2 observers • Funded through direct member country contributions at the ministerial level
  3. 3. Instruct on the ESFRI Roadmap
  4. 4. Our Mission • ACCESS – Facilitating access to cutting-edge research infrastructure and methods • FACILITY – Helping research infrastructures manage their equipment, and representing their interests • COMMUNITY – Contributing to the wider scientific community as a whole, and helping researchers, projects and infrastructures work better together • DATA – Improving access to research data, and facilitating Open Access • ARIA Cloud!
  5. 5. Structural Biologists use Microscopes • 12 samples (grids) in a loader • Each grid can potentially have multiple structures that are of interest (projects with 96-well grids are underway) • Outputs ~1-3 TB of HD video per day
  6. 6. Electron Microscopy • Researcher submits a proposal for access • Researcher produces a sample locally • Sample is loaded onto a grid • Grid goes into Electron Microscope • Micrographs go into pre-processor • Particle picking, auto & manual processing • Datasets are analysed by 10s of software packages • 3D structure determined • Structure deposited into PDB/EM-DB • Researcher submits a publication to journal
  7. 7. There are a lot of things to track… • Number of sample grids • Potentially multiplied by samples on a grid • Multiplied by grids in a microscope • Multiplied by frames of video • Multiplied by number of microscopes per facility • … multiplied by the number of facilities.
  8. 8. ... But wait, there’s more! • We need to know the data processing workflows used • We need to identify samples and associated metadata • We need to know a given machine’s configuration • Software and software versions used to process and analyse data • Researchers involved in project • Funding applications (proposals)
  9. 9. Structural Biologists also use Synchrotrons… • Similarly large data volumes • Similarly complex machine configurations • Similarly complex data processing workflow to produce results
  10. 10. Crystallography • Researcher submits a proposal for access • Researcher produces a sample locally • Sample added to crystal plate • Crystal plate imaged regularly • Crystals loaded onto pins • Crystals shot with X-rays at synchrotron • Diffraction pattern auto-analysis and re-running • 3D structure determined • Structure deposited into PDB • Researcher submits a publication to journal
  11. 11. Why Identify? – Improving workflows • Different samples need to be prepared in different ways • Not all experiments are successful • Do-overs are expensive!
  12. 12. Why Identify? - FAIR • Want to be FAIR! (improve findability, interoperability and reproducibility of data sets) • Commitment to Open Access • But… Data sets are too large to practically move about • Machine configurations are often only available on the machine itself • Software gets modified • How do we make this findable, accessible and reusable?
  13. 13. Some Problems to Consider • Will require minting of large quantities of PIDs • … in near real time • Metadata schemas around existing PIDs seem focussed on publications • … but we’d need to extend them (and make them machine readable) • … ditto how best to link / graph data together • Some facilities have rolled their own solutions, how to cooperate?
  14. 14. @ARIA_access aria@instruct-eric.eu Thanks!
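
The tracking problem on slides 7, 8 and 13 can be sketched in code. This is a minimal illustration under stated assumptions, not Instruct-ERIC's or ARIA's implementation: the `Registry` class, the `example:` prefix, the entity kinds, and every number other than the 12 grids per loader are hypothetical. It mints a locally unique identifier for each tracked entity, records which entities it was derived from, walks those links back from a structure to its proposal, and multiplies out the slide 7 factors to get a rough identifier count.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Record:
    pid: str
    kind: str  # e.g. "proposal", "sample", "grid" (hypothetical kinds)
    metadata: dict = field(default_factory=dict)
    derived_from: list = field(default_factory=list)  # PIDs of parent records


class Registry:
    """Hypothetical in-memory stand-in for a PID minting service."""

    def __init__(self, prefix="example"):
        self.prefix = prefix
        self.records = {}

    def mint(self, kind, derived_from=(), **metadata):
        # A real service would register the identifier with a resolver;
        # here a random UUID under a made-up prefix is enough.
        pid = f"{self.prefix}:{uuid.uuid4()}"
        self.records[pid] = Record(pid, kind, dict(metadata), list(derived_from))
        return pid

    def provenance(self, pid):
        """Return the kinds of all ancestors of pid, nearest first."""
        seen, kinds = set(), []
        queue = list(self.records[pid].derived_from)
        while queue:
            parent = queue.pop(0)
            if parent in seen:
                continue
            seen.add(parent)
            kinds.append(self.records[parent].kind)
            queue.extend(self.records[parent].derived_from)
        return kinds


def identifiers_needed(grids, samples_per_grid, frames_per_sample,
                       microscopes, facilities):
    """Multiply out the slide 7 factors for a rough identifier count."""
    return grids * samples_per_grid * frames_per_sample * microscopes * facilities


registry = Registry()
proposal = registry.mint("proposal", title="hypothetical access proposal")
sample = registry.mint("sample", derived_from=[proposal])
grid = registry.mint("grid", derived_from=[sample], loader_slot=1)
micrograph = registry.mint("micrograph", derived_from=[grid])
structure = registry.mint("structure", derived_from=[micrograph],
                          software="hypothetical-tool 1.0")

print(registry.provenance(structure))
# Only grids=12 comes from the slides; the other values are invented.
print(identifiers_needed(grids=12, samples_per_grid=2, frames_per_sample=1000,
                         microscopes=3, facilities=10))
```

Even with these made-up numbers the count reaches the hundreds of thousands, which is why slide 13 raises bulk minting in near real time. Keeping `derived_from` as plain identifiers means the graph can span facilities that run their own registries, as long as the identifiers resolve.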
