• Data management is a logical part of Library research support, but unfamiliar territory
• The safest long-term preservation option is not always the best for re-use
• E.g. existing University ICT staff are unfamiliar with digital preservation
• Data-labs save on ingest/acquisition/training costs, but publishing from data-lab to archive is not straightforward
3TU.Datacentrum Tuesday, June 12th 2012 Jeroen Rombouts
3TU.Datacentrum = …
• 3 Dutch TUs: Delft, Eindhoven, Twente
• Project 2008-2011, going concern 2012-
• Data archive
  – 2008-
  – “finished” data
  – preserve but do not forget usability
  – data citation information (incl. DataCite DOIs)
• Data labs
  – Started (hosting)
  – Unfinished data + software/scripts
Website & Data-archive
• http://datacentrum.3tu.nl
• Information
  – News, announcements
  – Publications, links and tutorials
• http://data.3tu.nl
• Data sets download and ‘management’
• ‘Use’ data with Google Maps/Earth, OPeNDAP, …
Data archiving options
• ‘Simple’ sets (Do It Yourself)
  – Standard (self)upload form and descriptive information, single file per object (can be a ‘zipped’ collection), single DOI, …
  – E.g.: Zandvliet, H.J.W. et al. (2010): Diffusion driven concerted motion of surface atoms: Ge on Ge(001). MESA+ Institute For Nanotechnology, University of Twente. doi:10.4121/uuid:3f71549c-6097-4bb8-bc00-6db77deb161d
• Special collections (Do It Together)
  – Negotiate: deposit procedure, description (xml, picture, preview), data model, level of DOI assignment, query online, …
  – E.g.: Otto, T., Russchenberg, H.W.J. (2010): IDRA weather radar measurements - all data. TU Delft - Delft University of Technology. doi:10.4121/uuid:5f3bcaa2-a456-4a66-a67b-1eec928cae6d
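The two example citations on this slide follow the same pattern: creators, year, title, publisher, DOI. A minimal Python sketch of that pattern is given here for illustration; the class and method names (`DatasetCitation`, `as_text`, `doi_url`) are assumptions made for this example and are not part of any 3TU.Datacentrum or DataCite tooling. The only external fact relied on is that a DOI can be resolved by appending it to https://doi.org/.

```python
from dataclasses import dataclass
from urllib.parse import quote

# Public DOI resolver; appending a DOI gives a resolvable link.
DOI_RESOLVER = "https://doi.org/"


@dataclass
class DatasetCitation:
    """Hypothetical helper holding the citation elements shown on the slide."""
    creators: str
    year: int
    title: str
    publisher: str
    doi: str

    def doi_url(self) -> str:
        # Keep "/" and ":" unescaped, since both occur in these DOIs.
        return DOI_RESOLVER + quote(self.doi, safe="/:")

    def as_text(self) -> str:
        # Same layout as the examples on the slide.
        return (f"{self.creators} ({self.year}): {self.title}. "
                f"{self.publisher}. doi:{self.doi}")


citation = DatasetCitation(
    creators="Zandvliet, H.J.W. et al.",
    year=2010,
    title="Diffusion driven concerted motion of surface atoms: Ge on Ge(001)",
    publisher="MESA+ Institute For Nanotechnology, University of Twente",
    doi="10.4121/uuid:3f71549c-6097-4bb8-bc00-6db77deb161d",
)

print(citation.as_text())
print(citation.doi_url())
```

For a ‘simple’ set there is one such record per deposited object; for a special collection the level at which DOIs are assigned is negotiated, so one collection may carry several records of this kind.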
Metadata ‘publication’
• Metadata in DataCite Metadata Store
  – https://mds.datacite.org
• Metadata harvestable (OAI-PMH) (CC0)
  – NARCIS (www.narcis.nl), …?
• Crawlable (OAI-ORE linked data) (CC0)
  – PRIMO (soon…)
• Open to search engine bots
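To illustrate what “harvestable (OAI-PMH)” means for a service such as NARCIS, here is a minimal Python sketch of a harvester's first step: building a `ListRecords` request and reading titles from a Dublin Core response. The `verb` and `metadataPrefix` parameters and the XML namespaces come from the OAI-PMH standard. The base URL, the record identifier, and the sample response are invented for this example, since the slides do not give the archive's OAI-PMH endpoint.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint: the archive's real OAI-PMH base URL is not in the slides.
BASE_URL = "https://example.org/oai"

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def record_titles(xml_text):
    """Return every dc:title found in an OAI-PMH response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iterfind(".//dc:title", NS)]


# Invented sample response (not real harvest output), used to avoid network access.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>IDRA weather radar measurements - all data</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url(BASE_URL))
print(record_titles(SAMPLE_RESPONSE))
```

A real harvester would fetch the URL over HTTP and follow `resumptionToken` elements to page through large result sets; both are left out of this sketch.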
Training & Data-labs
• http://dataintelligence.3tu.nl
• Reference, News & Events for training library staff.
• OpenEarth, SHARE, …?
Experience
• Front office
  – Being (physically) close helps build trust
  – Huge ‘disciplinary’ (individual) differences in openness and data management level
  – Need more than a few (trained) people
• Back office
  – Wide array of skills required (legal, IT, management, digital curation, research tools, training, …)
  – Trade-off between long-term preservation and (re-)use
  – Balancing generic and discipline-specific
• Data labs
  – Value for acquisition and standardization
What our account managers ‘sell’…
The benefits for data producers and data consumers
• Increased visibility of research output (metadata in repository networks, assigning DOIs, facilitating increased citation rates for ‘enhanced publications’, ...);
• Improved quality of datasets (quality assurance for multi-user setup, checks on ingest, …);
• (Long-term) preservation of, and accessibility to, valuable research data;
• Distribution of research data for reuse, including administration and usage statistics;
• Advice on data management, rights, formats, metadata, etc.
What do data producers say? 1/2
• Only for long term continuous data
• Datasets are stored by publisher
• No time!
• Our research is once only
• Interesting but not for me
• Nobody needs my data
• Our datasets are confidential
• Data transfer not needed, every PhD does own project
What do data producers say? 2/2
• Very useful, essential metadata often missing
• When can I store my datasets?
• Much to improve in reuse of data
• Good opportunity to share datasets we bought
• Would like to publish data
• Surprising our university had no facility for data preservation
• Transfer of data between PhDs can be improved