John morrissey c3 dis fair working data.pptx


FAIR - Working Data: It's not just about FAIR publishing. Presented by John Morrissey from CSIRO at the C3DIS post-conference workshop "Managed data – trusted research: an introduction to Research Data Management", 31 May 2018, in Melbourne.

Published in: Education

  1. FAIR - Working Data: It’s not just about FAIR publishing. John Morrissey, CSIRO IMT, 31 May 2018
  2. Are you being FAIR to the future you? In 5 years' time, will my research data be:
     • Findable – A top drawer full of USB drives and sticks isn’t always a good data archive.
     • Accessible – My new desktop doesn't have a DVD drive. And what was the password on that encrypted data drive?
     • Interoperable – I wonder where I put my old copy of the software that compiles this binary data file?
     • Reusable – How accurate was the sensor network I used to gather these observations? Am I allowed to reuse this data?
     FAIR - Working Data | John Morrissey
  3. Let’s start with a simple data map
     Data Inputs
     • Reference data sets
     • Sensor data
     • Public data streams
     • Lab instrumentation
     • Survey data
     • Observational data
     • ……
     Digital Processing
     • Data cleansing
     • Where will we process/share the data?
     • Modelling
     • Analysis
     • Machine learning
     • AI
     • Linking data sets
     • ……
     Digital Outputs
     • New data sets
     • Software
     • Additional reference data
     • Visualisations
     • Organisational data repositories (private data)
     • ……
  4. Case Study: Materials science, CSIRO
     Data sources:
     • Materials data
     • Equipment capabilities data
     • Process settings/configurations
     • Process monitoring sensors
     • Test equipment capabilities data
     • Test equipment data output
     • Sample tracking data
     • Lab notebooks (paper)
  5. Activity: Using the FAIR principles for the CSIRO materials science case study, what data management capabilities would you invest in to:
     1. Improve data sharing between CSIRO and the University
     2. Support the joint capture of critical data assets to support the IP management goals of both partners in the collaboration
     3. Create a data management capability that minimizes the rework required to publish project data to a global materials science community and industry
  6. FAIR Working Data. Findable: By whom? How? Minimum viable metadata?
     • Standardized naming conventions for folders and files
     • Consider using Readme.txt files to describe content. Maybe you could include metadata.txt or metadata.json files embedded in folders
     • Think about what persistent identifiers are useful in your project
     • Do you need a basic registry to manage metadata?
  7. FAIR Working Data. Accessible: By whom? How? What?
     • How will you manage identity and access control?
     • Shared storage resources – where?
     • Will you use simple storage or a higher-level platform like a shared eLab notebook or database?
     • What categories of data will you hold/share, and which data assets need to be kept long term?
  8. FAIR Working Data. Interoperable:
     • What are the key standards currently applied to the project's domain(s)?
     • Are my data-producing assets standards compliant? Do they need to be? What do I have to do to convert my data assets to the correct format?
     • Do we have a set of vocabularies we want to use within our project? Where are they?
     • Who can help me with my standards compliance work? (Librarians? IT specialists? Information management specialists?)
  9. FAIR Working Data. Reusable:
     • Agree on a licensing framework before the project starts producing data
     • What data assets need to be preserved long term?
     • What data assets will we publish?
     • Where will we publish?
     • Who has contributed to the data asset, and how will they be represented when the data is published?
     • Who will manage the long-term data archive?
  10. Our current recommendations for the Materials Science project
     • Complete a pre-project data audit
     • Transition from a paper-based lab notebook to an ELN provisioned in the cloud (possibly LabArchives or similar)
     • Dedicate a storage appliance to managing all the data streams for the lab; this can also be provisioned in the cloud if required
     • Agree on a file/folder naming convention that allocates a directory tree to each instrument
     • Create registries/databases for equipment, people and samples using globally recognized identifiers like ORCID for people
  11. Something to ponder… “The saddest aspect of life right now is that science gathers knowledge faster than society gathers wisdom.” Isaac Asimov
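
Three of the suggestions on slides 6 and 10 (a file/folder naming convention with a directory tree per instrument, metadata.json files embedded in folders, and ORCID identifiers for people) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the project's actual tooling: the folder layout, the function names and the metadata fields are all hypothetical. Only the ORCID check digit calculation (ISO 7064 MOD 11-2) follows a published rule.

```python
import json
import re
from datetime import date
from pathlib import Path


def slug(text: str) -> str:
    """Lowercase the text and collapse anything that is not a-z or 0-9 into '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def run_folder(root: Path, instrument: str, sample_id: str, run_date: date) -> Path:
    """Build a folder path for one instrument run.

    Assumed (hypothetical) convention: <root>/<instrument>/<YYYY-MM-DD>_<sample>
    so that each instrument gets its own directory tree.
    """
    return root / slug(instrument) / f"{run_date.isoformat()}_{slug(sample_id)}"


def orcid_is_valid(orcid: str) -> bool:
    """Check the final character of an ORCID iD using ISO 7064 MOD 11-2."""
    digits = orcid.replace("-", "")
    if not re.fullmatch(r"\d{15}[\dX]", digits):
        return False
    total = 0
    for ch in digits[:15]:
        total = (total + int(ch)) * 2
    check = (12 - total % 11) % 11
    return digits[15] == ("X" if check == 10 else str(check))


def write_metadata(folder: Path, title: str, creator_orcid: str,
                   licence: str, created: date) -> Path:
    """Write a metadata.json sidecar file into the data folder.

    The field names below are illustrative, not a formal metadata standard.
    """
    if not orcid_is_valid(creator_orcid):
        raise ValueError(f"invalid ORCID iD: {creator_orcid}")
    folder.mkdir(parents=True, exist_ok=True)
    record = {
        "title": title,
        "creator_orcid": creator_orcid,
        "licence": licence,
        "created": created.isoformat(),
    }
    path = folder / "metadata.json"
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path
```

A call such as `write_metadata(run_folder(Path("lab-data"), "XRD 01", "Sample A/7", date(2018, 5, 31)), ...)` would create the run folder and place the sidecar file in it. Rejecting malformed ORCID iDs at write time is one way to keep a people registry consistent, though a real project would probably also validate the other fields against its agreed vocabularies.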