RDMRose 2.5 Metadata and data citation

Metadata and data citation. Session 2.5 of the RDMRose v3 materials.
The JISC-funded RDMRose project (June 2012-May 2013) was a collaboration between the libraries of the Universities of Leeds, Sheffield and York and the Information School at Sheffield to provide an Open Educational Resource on Research Data Management for information professionals. The materials were revised between November 2014 and February 2015 for the consortium of North West Academic Libraries (NoWAL).
http://www.sheffield.ac.uk/is/research/projects/rdmrose

Published in: Data & Analytics

  1. Metadata and data citation
      Research Data Management Workshop 2.5
      May-15. Learning material produced by RDMRose http://www.sheffield.ac.uk/is/research/projects/rdmrose
  2. Learning Outcomes
      By the end of this session you will be able to:
      • Discuss the varying requirements of metadata that will enable researchers to identify the potential of a particular dataset
      • Evaluate ways of citing data
      • Articulate and reflect upon some of the issues involved with citing data and datasets
  3. Session 2.5 overview
      • EPSRC principles and expectations
      • What is sufficient metadata?
      • How to cite data?
  4. EPSRC Principle 6
      • “Sufficient metadata should be recorded and made openly available to enable other researchers to understand the potential for further research and re-use of the data. Published results should always include information on how to access the supporting data.”
      http://www.epsrc.ac.uk/about/standards/researchdata/Pages/principles.aspx
  5. EPSRC Expectation 5
      • “Research organisations will ensure that appropriately structured metadata describing the research data they hold is published (normally within 12 months of the data being generated) and made freely accessible on the internet; in each case the metadata must be sufficient to allow others to understand what research data exists, why, when and how it was generated, and how to access it. Where the research data referred to in the metadata is a digital object it is expected that the metadata will include use of a robust digital object identifier (For example as available through the DataCite organisation - http://datacite.org).”
      http://www.epsrc.ac.uk/about/standards/researchdata/Pages/expectations.aspx
  6. Activity 1: Metadata
      • What is “sufficient metadata” that enables “other researchers to understand the potential for further research and re-use of the data”?
  7. Activity 1: Metadata
      The University of Poppleton holds a dataset with meteorological observations, taken at the university’s weather station. In particular, it contains a set of precipitation measurements since the foundation of the university.
      • A climatologist, Jenny Fairweather, is interested in this dataset for her research into climate change. She is looking for trends in the weather.
      • A meteorologist, Wilson Rainbird, who works for the UK Met Office, wants to use these data for the purposes of weather prediction. He is mainly interested in combining these precipitation measurements with other similar datasets.
      • A researcher, Alice Snowe, from another university’s Accident Research Unit conducts most of her research in the area of road traffic accidents. She would like to map the precipitation measurements to another dataset containing information on road accidents in order to analyse possible correlations.
      • Lastly, the university’s data repository manager, John Shower, is concerned with issues regarding data access and IPR.
  8. Activity 1: Metadata
      • What is “sufficient metadata” for each of these stakeholders “to understand the potential for further research and re-use of the data”?
  9. Example
      • The DaMaRO project at the University of Oxford has developed a metadata schema for its DataFinder (Rumsey, 2012).
      • A three-tier metadata approach:
        – Mandatory minimal metadata to enable basic discovery, such as Creator, Title, Publisher, Date, Location, Access terms & conditions
        – Mandatory contextual metadata (mostly administrative and partly based on EPSRC expectations), such as Funding Agency, Grant Number, Last access request date, Project Information, Data Generation Process, Why the data was generated, Date (range) of data collection, Reasons for embargo
        – Optional metadata (including discipline-specific metadata) to enable reuse, such as machine settings and experimental conditions under which the data were gathered
  10. Activity 2: Data citation
      • How should data be cited?
      • There are no established standards for data citation yet, although some style manuals such as the APA’s (in the 5th and 6th editions) and some repositories such as the UK Data Archive do provide instructions.
  11. Activity 2: Data citation
      • A researcher, Alice Snowe, from another university’s Accident Research Unit is seeking to use the dataset with precipitation measurements going back to the foundation of the University. This dataset was deposited in 2011 by the University’s meteorologist, Christopher Oldman Frost, and covers all years up to and including 2010. It consists of data subsets that are organised per year, each consisting of several files, including Excel spreadsheets, Word files, and image files (digitised observations written down on paper). Of course, Mr Oldman Frost is not the only meteorologist who has been involved in taking the measurements that make up this dataset.
  12. Activity 2: Data citation
      • Alice Snowe is now writing a research paper for Science called ‘The correlation between bicycle accidents and precipitation in urban centres during the rush hour’. She needs to cite our institutional repository’s dataset. In particular she will need to refer to the precipitation measurements of 4 May 1979. Elsewhere in her article she also needs to refer to a subset covering the winter months of the years 1981-1985.
      • Write down the references that Alice Snowe needs to give in her article.
  13. APA
      Basic form:
      • Rightsholder. (Year). Title of data set (Version number) [Description of form]. Location: Name of producer.
        or
        Rightsholder. (Year). Title of data set (Version number) [Description of form]. Retrieved from http://
      Example:
      • University of Poppleton (2011). Precipitation measurements 1905-2010 taken at Western Bank weather station [Data files and documentation]. Poppleton: The University of Poppleton, Meteorological Service.
  14. DataCite
      • DataCite (http://www.datacite.org) is a not-for-profit organisation that aims to promote and support the sharing of research data
      • They are developing an infrastructure that supports methods of data citation, discovery, and access
      • They are currently leveraging the DOI (Digital Object Identifier) infrastructure, which is also used for research articles
      • They can provide DOIs for datasets
      • DataCite DOIs have to resolve to a public landing page with information about the dataset and a direct link to it
  15. DataCite
      Basic form:
      • Creator (PublicationYear): Title. Version. Publisher. ResourceType. Identifier
      • Version and ResourceType are optional elements
      • For citation purposes, DataCite recommends that DOI names are displayed as linkable, permanent URLs
      • More info in DataCite (2011)
      Example:
      • University of Poppleton (2011): Precipitation measurements 1905-2010 taken at Western Bank weather station. Meteorological service, The University of Poppleton. http://dx.doi.org/10.1594/UoP.MS.298
  16. Activity 2: Data citation
      • What practical issues did you encounter when writing the references for Alice Snowe’s research paper? How could these issues be solved?
  17. Data Citation
      • Issues include (Ball & Duke, 2011a and b):
        – At what granularity should data be made citeable?
        – How to credit each contributor in a dataset that is assembled from very many contributions?
        – Where in a research paper should a data citation be given (e.g. a paper describing a dataset versus subsequent papers using it)?
        – What to do with frequently updated data?
  18. REFERENCES
  19. References
      • American Psychological Association (2010). Publication Manual of the American Psychological Association (6th edition). Washington, DC: American Psychological Association, pp. 210-211.
      • Ball, A., & Duke, M. (2011a). Data Citation and Linking. DCC Briefing Papers. Edinburgh: Digital Curation Centre. Retrieved from http://www.dcc.ac.uk/resources/briefing-papers/introduction-curation/data-citation-and-linking
      • Ball, A., & Duke, M. (2011b). How to Cite Datasets and Link to Publications. DCC How-To Guides. Edinburgh: Digital Curation Centre. Retrieved from http://www.dcc.ac.uk/resources/how-guides/cite-datasets
  20. References
      • DataCite (2011). DataCite Metadata Schema for the Publication and Citation of Research Data. Version 2.2. London: DataCite. Retrieved from http://schema.datacite.org/meta/kernel-2.2/doc/DataCite-MetadataKernel_v2.2.pdf. doi:10.5438/0005
      • DataCite (n.d.). Why cite data? Hannover. Retrieved from http://datacite.org/whycitedata
      • Rumsey, S. (2012). Just enough metadata: Metadata for research datasets in institutional data repositories [PowerPoint presentation]. Oxford: The University of Oxford. Retrieved from http://damaro.oucs.ox.ac.uk/docs/Just%20enough%20metadata%20v3-1.pdf
      • UK Data Archive (n.d.). Citing Data. Colchester. Retrieved from http://www.data-archive.ac.uk/conditions/citing-data
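
The two basic forms on slides 13 (APA) and 15 (DataCite) can be read as simple templates over the same descriptive fields. The following is a minimal sketch in Python, not part of the RDMRose materials: the `DatasetRecord` class, its field names, and both function names are hypothetical and chosen only for illustration. The APA function follows the basic form literally (including the full stop after the rightsholder), so its output differs slightly in punctuation from the worked example on the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetRecord:
    """Hypothetical container for the fields used by both citation forms."""
    creator: str                          # APA "Rightsholder" / DataCite "Creator"
    year: int
    title: str
    publisher: str                        # APA "Name of producer" / DataCite "Publisher"
    identifier: str = ""                  # DOI URL or other locator
    version: Optional[str] = None         # optional element in the DataCite form
    resource_type: Optional[str] = None   # optional element in the DataCite form
    description: Optional[str] = None     # APA "[Description of form]"
    location: Optional[str] = None        # APA place of publication


def datacite_citation(r: DatasetRecord) -> str:
    """Creator (PublicationYear): Title. Version. Publisher. ResourceType. Identifier"""
    # Optional elements (Version, ResourceType) are skipped when absent.
    parts = [r.title, r.version, r.publisher, r.resource_type]
    body = ". ".join(p for p in parts if p)
    return f"{r.creator} ({r.year}): {body}. {r.identifier}".strip()


def apa_citation(r: DatasetRecord) -> str:
    """Rightsholder. (Year). Title (Version) [Description]. Location: Producer."""
    title = r.title
    if r.version:
        title += f" ({r.version})"
    if r.description:
        title += f" [{r.description}]"
    # Use the "Location: Producer" variant when a location is known,
    # otherwise fall back to the "Retrieved from" variant.
    if r.location:
        source = f"{r.location}: {r.publisher}."
    else:
        source = f"Retrieved from {r.identifier}"
    return f"{r.creator}. ({r.year}). {title}. {source}"


# The fictional Poppleton dataset from the slides.
poppleton = DatasetRecord(
    creator="University of Poppleton",
    year=2011,
    title="Precipitation measurements 1905-2010 taken at Western Bank weather station",
    publisher="Meteorological service, The University of Poppleton",
    identifier="http://dx.doi.org/10.1594/UoP.MS.298",
    description="Data files and documentation",
    location="Poppleton",
)

print(datacite_citation(poppleton))
print(apa_citation(poppleton))
```

Neither template has a slot for a single day (4 May 1979) or a multi-year subset (winters 1981-1985), nor for more than one creator string, which is one way of seeing the granularity and contributor-credit issues raised on slide 17.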
