2. Research Data Management is Painful
● Time-consuming to store, describe and organise data
● Poor storage media
● Costly storage
● Hard to share data with collaborators (especially external to institution)
● Hard to access data
● Hard to open data to the public and cite it in a journal
8. Research Data Management is Painful
● Time-consuming to store, describe and organise data
● Manual effort required to store and describe data
● Poor storage media
● Costly storage
● Hard to share data with collaborators (especially external to institution)
● Hard to access data
● Hard to open data to the public and cite it in a journal
20. Store.Synchrotron.org.au
● Store.Synchrotron is a service that captures all macromolecular beamline data, available
online to all non-commercial Australian Synchrotron users. It was developed by Monash
University in a strategic, ongoing partnership.
● Data is immediately shareable by the researcher on the web and able to be published.
● The service runs on the Australian NeCTAR Research Cloud in a scalable setup that can
withstand load, backed by large, fully redundant RDSI (VicNode) storage.
● We’re actively opening access to raw data behind high-impact research publications
under CC BY licenses. Six institutions have opened data so far.
● Built on MyTardis, an open-source, Australian-made data management system.
● Visit store.synchrotron.org.au for access
24. Real-time instrument data capture
Capture began in June 2013. As of July 2014, the service had captured over 31
terabytes of data in over 2.4 million raw diffraction images.
Source: http://bdp-aaf-dev.dyndns.org/graphtime.html
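As a rough sanity check on these figures, the average size of a raw diffraction image can be estimated from the two totals. This is only a back-of-envelope sketch: both totals are stated as "over", and decimal terabytes (10^12 bytes) are an assumption, as is the helper name `average_image_size_mb`.

```python
def average_image_size_mb(total_terabytes, image_count):
    """Estimate the mean image size in megabytes.

    Assumes decimal units (1 TB = 10**12 bytes, 1 MB = 10**6 bytes).
    """
    total_bytes = total_terabytes * 10**12
    return total_bytes / image_count / 10**6


# Totals quoted on the slide: 31 TB across 2.4 million images.
print(round(average_image_size_mb(31, 2_400_000), 1))  # roughly 12.9 MB per image
```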
29. Case Study: Monash Micro Imaging
Users organise data into username/dataset folders on the instrument
control computers, and MyTardis retrieves the data automatically.
Over 15 microscopes at Monash Micro Imaging are integrated.
The same integration is under way for gene sequencers (UoM), nanofabrication (RMIT), MRI
(UQ) and more.
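The username/dataset folder convention can be illustrated with a short sketch. This is not the MyTardis harvesting code; it only shows how a collector might map files to an owner and a dataset from their path, assuming a layout of `<root>/<username>/<dataset>/<files>`. The function name `discover_datasets` is hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def discover_datasets(root):
    """Group files under root/<username>/<dataset>/ by (username, dataset)."""
    root = Path(root)
    found = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        parts = path.relative_to(root).parts
        # Expect at least <username>/<dataset>/<file>; skip anything shallower.
        if len(parts) < 3:
            continue
        username, dataset = parts[0], parts[1]
        found[(username, dataset)].append(path)
    return dict(found)
```

Files in nested subfolders are attributed to the enclosing dataset, and files placed directly in a user's folder are ignored because they belong to no dataset.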
30. Case Study: RMIT Cloud HPC Provider
[Diagram: Processing on NeCTAR Cloud; HRMC Web App; Results and Analysis in MyTardis]
● Complex high performance computing both on cloud infrastructure and queue-based systems
● Simple web interface to start complex MapReduce runs with parameter sweeps (Hybrid Reverse Monte Carlo simulations)
● Results come back with analysis: graphs auto-generated
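The idea of a parameter sweep with a map step and a reduce step can be sketched in a few lines. This is a toy illustration only, not the HRMC application: the parameter names and the `simulate` callable are hypothetical stand-ins, and everything runs in one process rather than on cloud or queue-based nodes.

```python
from itertools import product


def parameter_sweep(grid):
    """Yield one dict per combination of the values in grid."""
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def run_sweep(grid, simulate):
    """Map: run simulate on every combination. Reduce: keep the lowest score."""
    results = [(simulate(**params), params) for params in parameter_sweep(grid)]
    return min(results, key=lambda item: item[0])


# Hypothetical grid and scoring function, for illustration only.
grid = {"temperature": [1, 2], "steps": [10, 20]}
best_score, best_params = run_sweep(
    grid, lambda temperature, steps: abs(temperature - 2) + abs(steps - 10)
)
```

In a real deployment each combination would be dispatched to a separate worker, with the reduce step collecting the results for analysis.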
31. Manual Curation
Upload data from anywhere.
Button and Drag’n’Drop Code
(github.com/steveandroulakis/mytardis-uploader)
33. Research Data Management is possible
● Time-consuming to store, describe and organise data
● Manual effort required to store and describe data
● Poor storage media
● Costly storage
● Hard to share data with collaborators (especially external to institution)
● Hard to access data
● Hard to open data to the public and cite it in a journal
34. Coming Soon..
● Deploy a scalable MyTardis on a cloud (like OpenStack or Amazon EC2) in a few
commands (using SaltStack).
● Mount MyTardis on the file system to browse and access your data.
● An instrument integration app is in development, with a double-click installer for simple
transfer of instrument data to MyTardis (works on Windows!).
● Go to mytardis.org for more news and information.
● Thanks!
● Contact Me: Steve Androulakis (Bioinformatics Manager), steve.androulakis@monash.edu