Future agenda: repositories, and the research process
Non-Standard Archiving of Research
Tom Phillips, A
Martin Donnelly, Digital Curation Centre, University of Edinburgh
Nottingham Trent University, 13 May 2014
What is research data management?
“the active management and appraisal
of data over the lifecycle of scholarly
and scientific interest”
Data management is a part of good research practice.
- RCUK Policy and Code of Conduct on the
Governance of Good Research Conduct
The old way of doing (science)
1. Researcher collects data (information)
2. Researcher interprets/synthesises data
3. Researcher writes paper based on data
4. Paper is published (and preserved)
5. Data is left to benign neglect, and
eventually ceases to be accessible
The new way of doing (science)
Other models are available…
Ellyn Montgomery, US Geological Survey
Drivers and benefits of RDM
TRANSPARENCY: The evidence that underpins research
can be made open for anyone to scrutinise, and for others
to attempt to replicate the findings.
EFFICIENCY: Data collection can be funded once, and
used many times for a variety of purposes.
RISK MANAGEMENT: A pro-active approach to data
management reduces the risk of inappropriate
disclosure of sensitive data, whether commercial or
PRESERVATION: Lots of data is unique, and can only be
captured once. If lost, it can’t be replaced.
Definitions vary from discipline to discipline, and from funder to funder…
Here’s a science-centric definition:
“The recorded factual material commonly accepted in the scientific community as
necessary to validate research findings.” (US Office of Management and Budget,
[Addendum: This policy applies to scientific collections, known in some disciplines
as institutional collections, permanent collections, archival collections, museum
collections, or voucher collections, which are assets with long-term scientific value.
(US Office of Science and Technology Policy, Memorandum, 20 March 2014)]
And another from the visual arts:
“Evidence which is used or created to generate new knowledge and
interpretations. ‘Evidence’ may be intersubjective or subjective; physical or
emotional; persistent or ephemeral; personal or public; explicit or tacit; and is
consciously or unconsciously referenced by the researcher at some point during
the course of their research.”
(Leigh Garrett, KAPTUR project: see http://kaptur.wordpress.com/)
So what is ‘data’ exactly?
Scientific and other methods…
The scientific method is a body of
techniques for investigating phenomena,
acquiring new knowledge, or correcting and
integrating previous knowledge.
To be termed scientific, a method of inquiry
must be based on empirical and measurable
evidence subject to specific principles of reasoning.
The Oxford English Dictionary defines the
scientific method as: “a method or procedure
that has characterized natural science since
the 17th century, consisting in systematic
observation, measurement, and experiment,
and the formulation, testing, and modification of hypotheses.”
An art methodology differs from a science
methodology, perhaps mainly insofar as the artist is
not always after the same goal as the scientist. In art it
is not necessarily all about establishing the exact truth
so much as making the most effective form (painting,
drawing, poem, novel, performance, sculpture, video,
etc.) through which ideas, feelings, perceptions can
be communicated to a public. With this purpose in
mind, some artists will exhibit preliminary sketches
and notes which were part of the process leading to
the creation of a work. Sometimes, in Conceptual art,
the preliminary process is the only part of the work
which is exhibited, with no visible end result displayed.
In such a case the "journey" is being presented as
more important than the destination.
There’s nothing new about data re-use in the Arts and Humanities;
it’s an integral part of the culture, and always has been…
Think Kristeva’s intertextuality, Barthes’ ‘galaxy of signifiers’,
Shakespeare’s plots, Lanark’s assorted ‘plagiarisms’, Edwin Morgan’s
‘found’ newspaper poems, Marcel Duchamp, variations on a theme,
collage and intermedia art, T.S. Eliot, sampling/hip-hop, etc etc
However, it’s often more fraught than data re-use in other areas
(such as the Sciences)
For starters, people tend not to think of their sources or influences
as ‘data’, and the value and referencing systems are quite different
Furthermore, practice/praxis-based research is pretty much the sole
preserve of the Humanities, and research/production methods are
not always rigorously methodical or linear…
Strengths and weaknesses re. data in the Arts and Humanities
Some characteristics of Arts and Humanities data are likely to
require a different kind of handling from that afforded to other
Arts ‘data’ is often personal, and creative data in particular may not
be factual in nature. Furthermore, it may be quite valuable or
precious to its creator. What matters most may not be the content
itself, but rather the presentation, the arrangement, the quality of
This tends to be why Open Access embargoes are often longer in
the Arts and Humanities than other areas
Digital ‘data’ emerging in the Arts is as likely to be an outcome of the
creative research process as an input to a workflow. This is at odds
with the scientific method, and how most RDM resources are
Problems re. data in the Arts and Humanities
Are the goals – or indeed the concepts – of evidence, facts, validation, replication
still central in disciplines reliant on subjectivity, interpretation, argument and
qualities of expression?
How do we identify, preserve and share ephemera, emotions, the unconscious…?
How do we protect rights around creative data? What are the financial/ownership
issues accompanying creative/Arts research?
Is it clear where creative research begins and ends? How can we differentiate
between funded research and unfunded personal work?
What complexities are introduced by practice-driven research?
To what extent is non-digital material a problem? Can we share approaches to this
with other subject areas (e.g. biology, geology)?
What other characteristics do Arts and Humanities data have in common with
those of the Sciences? Which other disciplines share these issues more generally?
A few questions around data in the Arts and Humanities
Business case (“could anyone die?”)
Retention and embargo periods
Respect des fonds?
Multiplicity of (file) formats and creation/storage media
Linking analogue and digital, structuring collections
Commercial considerations and IPR… personal data?
Access arrangements / digitisation
Metadata (NISO): descriptive (for discovery), administrative (for reuse),
structural (for inter-relating objects) – obviously this also costs money…
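The three metadata types listed above, and the point about linking analogue and digital material, can be made concrete with a small sketch. This is an illustration only: the ArchivedItem class, its field names, the find_by_keyword helper and the sample records are all hypothetical, and none of them is taken from NISO or from any repository platform.

```python
from dataclasses import dataclass, field


@dataclass
class ArchivedItem:
    """One archived object (hypothetical model, not a NISO schema)."""
    identifier: str
    # Descriptive metadata: supports discovery.
    descriptive: dict = field(default_factory=dict)
    # Administrative metadata: supports reuse (rights, format, and so on).
    administrative: dict = field(default_factory=dict)
    # Structural metadata: identifiers of related objects.
    related_to: list = field(default_factory=list)


def find_by_keyword(items, keyword):
    """Return items whose descriptive title or keywords mention `keyword`."""
    needle = keyword.lower()
    hits = []
    for item in items:
        parts = [item.descriptive.get("title", "")]
        parts += item.descriptive.get("keywords", [])
        if needle in " ".join(parts).lower():
            hits.append(item)
    return hits


# Invented sample records: an analogue sketchbook and a digital scan of one page.
sketchbook = ArchivedItem(
    identifier="item-001",
    descriptive={"title": "Studio sketchbook", "keywords": ["drawing", "preparatory"]},
    administrative={"rights": "creator retains copyright", "format": "paper"},
)
scan = ArchivedItem(
    identifier="item-002",
    descriptive={"title": "Scan of sketchbook page 3", "keywords": ["drawing"]},
    administrative={"rights": "creator retains copyright", "format": "TIFF"},
    related_to=["item-001"],  # structural link from the scan back to the original
)

matches = find_by_keyword([sketchbook, scan], "sketchbook")
print([m.identifier for m in matches])
```

Every field in a record like this has to be filled in by someone, which is one way of seeing why metadata costs money.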
Archiving issues around Arts and Humanities data
Need – what do we need to archive? Is it evidence without which the
research outcomes are in doubt?
Want – do we want to archive materials for other reasons? Does
preserving preparatory/developmental work provide a richer
experience/understanding of the creative work and process? How do
we make a business case for this?
Many creative researchers are on fractional contracts, and there is not
always a clear delineation between professional work and personal
practice. Where and how do we locate the line?
More practically, the same notebook or sketchbook may be used for
both professional and personal purposes. Its contents may be messy,
Is an artwork ever finished, or just abandoned? (cf. Paul Valéry) How
do we know? Sometimes the demo version is better…
Possible discussion points