ARIADNE is funded by the European Commission's Seventh Framework Programme
Data sharing
Kate Fernie
Overview
• Archaeology data: rights and licences
• Open access
• Open data
• Open licences
• Barriers and benefits of data sharing
• Plan ahead: considerations
• Group Discussion Exercise
Archaeology data and rights
The processes and activities involved in archaeological research can result in the
generation of Intellectual Property Rights at different stages
• The actors may include:
– Owners/managers of the monument, site or artefact, e.g. national heritage
organisation, museum, private persons.
– Funding bodies, who may own the IPR in the content and assign licences for its use
– Organisations involved in data capture and post-processing of the content
– Researchers
• Agreements may cover physical access to the monument, the IPR in the content
and licences for its use
• Content includes text documents, images, 3D models, videos and original data
created by the archaeological research.
• Metadata, which is provided for discovery and to promote re-use of the content, is
generally openly licensed
Copyright and research data
• Copyright protects the expression of an idea
– not the idea itself.
• Data is not covered by copyright
– but the arrangement of data in a spreadsheet or database is
• Copyright is assigned when a creative work is produced
– Funding bodies may request copyright is assigned to themselves
– Employers may claim copyright of works produced by their staff.
• How long copyright lasts varies according to the type of work and the
country
• Copyright law varies from country to country.
• Different institutions have different copyright clauses in their employment
contracts.
“Intellectual property rights, very broadly, are rights granted to creators and owners
of works that are the result of human intellectual creativity”
Licences
• Copyright protects your work
• Licences are your way of saying how people may use it and
cover:
– Attribution (of you as the author of the work)
– Permitted uses (e.g. education, commercial uses, open access)
• Limitations on use e.g. publication of an image in a journal article
– Derivatives – whether people can make copies, remix or use the
content to create new works
– Share alike - a licence condition that specifies that new works must
be licensed under the same terms
Some context: open access to scientific data
• The European Union promotes open access to
publications and data with the aim of:
– Enabling researchers to build on previous research
– Fostering collaboration between researchers
– Accelerating innovation
– Involving citizens and society
What is open access?
Open access can be defined as providing on-
line access to scientific information that is free
of charge to the end-user and is re-usable
Main routes to open access:
• Open access scientific journals
• Open access data repository
Archaeology and open access
Open access publication:
Atkinson, M. and Preston S. (2015).
Heybridge: A late Iron Age and
Roman settlement. Excavations at
Elms Farm 1993-5. Volume 2,
Internet Archaeology 40.
http://dx.doi.org/10.11141/ia.40.1
Open access data archive:
Related digital archive: Essex County Council
(2015) Elms Farm Portfolio Project [data-set]:
http://dx.doi.org/10.5284/1021668
Open data: accessible online
• Accessible online
– The dataset is available online via a service
– Users may need to register to access the data
http://dans.knaw.nl/en/search
Open data: re-usable
• Open data is re-usable
– Available in an open format that allows for re-analysis, e.g. the
data is in a spreadsheet and not locked in a PDF document
– Is more than the summarised data in publications (e.g. figures
and charts)
– It may be original raw data, or may have been cleaned or normalised
when deposited
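The point about open formats can be illustrated with a minimal sketch. It assumes a small set of hypothetical finds records (the field names and values are invented for illustration, not taken from any real archive) and uses only Python's standard csv module to write them out as CSV and read them back for re-analysis.

```python
import csv
import io

# Hypothetical finds records; field names and values are illustrative only.
records = [
    {"find_id": "F001", "context": "1004", "material": "pottery", "weight_g": 35.2},
    {"find_id": "F002", "context": "1004", "material": "bone", "weight_g": 12.0},
]


def to_csv(rows):
    """Serialise a list of dicts to CSV text, an open and re-analysable format."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def from_csv(text):
    """Read CSV text back into a list of dicts (all values come back as strings)."""
    return list(csv.DictReader(io.StringIO(text)))


csv_text = to_csv(records)
reloaded = from_csv(csv_text)
```

Anyone with a spreadsheet or a few lines of code can reload data shared this way, which is not true of a table embedded in a PDF.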
Open data: open licences
• Open licences permit re‐use of data for free
• Includes any royalty‐free copyright licence
• Example licences:
– CC0 - Creative Commons Zero (Public Domain dedication without
attribution)
– CC-BY – Creative Commons Attribution
– CC-BY-SA – Creative Commons Share-alike
– ODC PDDL - Open Data Commons Public Domain Dedication and
Licence
– ODbL – Open Data Commons Open Database licence
• http://creativecommons.org/licenses/
• http://opendatacommons.org/
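As a rough aid to comparing the example licences, the sketch below records two conditions for each one: whether attribution is required and whether derivatives must be shared alike. The table is a simplified summary written for illustration (the dictionary and function names are hypothetical); it is not a substitute for reading the licence texts at the links above.

```python
# Simplified summary of the example licences; consult the full licence
# texts before relying on it.
LICENCES = {
    "CC0": {"attribution": False, "share_alike": False},
    "CC-BY": {"attribution": True, "share_alike": False},
    "CC-BY-SA": {"attribution": True, "share_alike": True},
    "ODC PDDL": {"attribution": False, "share_alike": False},
    "ODbL": {"attribution": True, "share_alike": True},
}


def obligations(licence):
    """Return the conditions a re-user must meet under the named licence."""
    terms = LICENCES[licence]
    duties = []
    if terms["attribution"]:
        duties.append("attribution")
    if terms["share_alike"]:
        duties.append("share alike")
    return duties
```

For example, obligations("CC-BY") lists attribution only, while obligations("CC0") returns an empty list.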
Open licenses: example of re-use
Barriers to data sharing
• Priority of published papers / little academic reward for
development and sharing of datasets
• Existing copyrights, confidential and sensitive data
• Concerns of researchers that data could be scooped, misused or
misinterpreted
• Potential reputational risk (e.g. data quality, errors,…)
• Required effort to share re-usable data (incl. formatting,
metadata creation, licensing etc.)
• Perceived lack of appropriate data archives (trusted, sustainable,
...)
Sensitive data
Archaeological datasets may sometimes include sensitive or confidential information
relating to individuals but which provides valuable historical or contextual information.
• Personal Data is data relating to living individuals which identifies them: name, age,
sex, address, photographs, etc.
• Sensitive Personal Data is data that may incriminate a person such as:
– Race, ethnic origin, political opinion, religious beliefs, physical/mental health,
sexual orientation, criminal proceedings or convictions.
• Confidential data includes:
– Data given in confidence, or agreed to be kept confidential (i.e. not released into
public domain).
– Data covered by ethical guidelines, legal requirements, or research consent
forms.
• Sharing of such data can often be achieved using a combination of obtaining
consent, anonymising data and regulating data access.
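One of the steps above, anonymising data, might look something like the following sketch. It assumes hypothetical records with a "name" field, and the list of direct identifiers is an illustrative choice. It drops those fields and replaces the name with a salted hash. Strictly this is pseudonymisation: the remaining fields may still identify a person when combined, so it does not replace consent or controlled access.

```python
import hashlib

# Fields treated as direct identifiers in this sketch (an assumption;
# a real project would decide this case by case).
DIRECT_IDENTIFIERS = {"name", "address", "photograph"}


def anonymise(record, secret_salt):
    """Drop direct identifiers and add a stable pseudonymous ID.

    The ID is a truncated SHA-256 hash of the salt plus the name, so the
    same person gets the same ID across records without the name being kept.
    """
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    digest = hashlib.sha256((secret_salt + record["name"]).encode("utf-8")).hexdigest()
    cleaned["participant_id"] = digest[:12]
    return cleaned


# Hypothetical interview record from a community archaeology project.
example = {"name": "A. Example", "address": "1 High Street", "role": "volunteer"}
safe = anonymise(example, secret_salt="keep-this-salt-private")
```

The salt should be stored separately from the shared dataset, otherwise the hashes could be reproduced from a list of candidate names.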
Benefits of data sharing
• Access to data is a scholarly
communication
• Citation increases the visibility of research
• Motivates/inputs new research
• Verification of research/research
integrity
• Stimulates new collaborations
• Re-use/-purposing of well curated
data increases research efficiency
Charles Beagrie: Keeping Research Data Safe (KRDS) benefits framework
How to reap the benefits?
• Deposit data in a recognised repository which
– provides unique persistent identifiers (e.g. DOIs)
– requires users to follow citation standards (e.g. DataCite)
• Provide good metadata – “no pain, no gain”
– Key for data re-use without direct contact with creator
– Include costs of preparing data and metadata for
publication in requests for project funding
• Apply an open licence that allows reuse
• Cite your own data!
Attribution of Research Data
• Example licences: CC-BY and ODC-BY
• Attribution for a dataset citation:
– For example: Evans, T.N.L. and R.H. Moore (2014) 'The Use of PDF/A
in Digital Archives: A Case Study from Archaeology' International
Journal of Digital Curation . Vol. 9, No. 2, pp. 123-138. DOI:
10.2218/ijdc.v9i2.267
• Ask for a persistent identifier
– resolves to an Internet location, e.g. Handles, Archive Resource Keys
(ARKs), Persistent URLs (PURLs) and Digital Object Identifiers (DOIs)
• Attribution helps:
– The data and its impact to be tracked
– Re-use and verification of data
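To show how a persistent identifier supports attribution, here is a small sketch that turns a DOI into a resolvable link through the doi.org resolver and builds a dataset citation string. The function names and the citation layout are assumptions for illustration; a repository's own citation guidance should take precedence. The example uses the Elms Farm archive cited earlier in these slides.

```python
def doi_to_url(doi):
    """Normalise a DOI (bare or with a common prefix) to a doi.org link."""
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return "https://doi.org/" + doi


def format_dataset_citation(creator, year, title, doi):
    """Build a simple dataset citation; the layout is illustrative only."""
    return f"{creator} ({year}) {title} [data-set]: {doi_to_url(doi)}"


citation = format_dataset_citation(
    "Essex County Council",
    2015,
    "Elms Farm Portfolio Project",
    "http://dx.doi.org/10.5284/1021668",
)
```

Because the DOI resolves to wherever the archive currently lives, the citation keeps working even if the repository reorganises its website.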
What we mean by open data
• Accessible online
• Free at the point of use
• Reusable
• Openly licensed (e.g. CC-BY, CC-BY-SA)
– Licence allows derivatives to be created
Group discussion
For your own project data:
• What data will be produced/could be archived?
• Are there any barriers to you sharing the data?
• What steps will you need to carry out to do this?
• How might open access benefit your research?
Acknowledgements
ARIADNE is a project funded by the European Commission under the Community’s
Seventh Framework Programme, contract no. FP7-INFRASTRUCTURES-2012-1-313193.
Teaching Materials for Research Data Management in Archaeology created by
Lindsay Lloyd-Smith (2011) as part of the JISC funded DataTrain project based at the
Cambridge University Library
ARIADNE, 2014, D3.3 Report on data sharing policies:
http://www.ariadne-infrastructure.eu/Resources/D3.3-Report-on-data-sharing-policies
3D ICONS, 2014, Guidelines:
http://www.3dicons-project.eu/eng/Guidelines-Case-Studies/Guidelines2

Editor's Notes

  • #4 Many different actors can be involved in an archaeological research project. They range from the owners or managers of a monument, site or artefact, to funding bodies (who may specify conditions, e.g. EC funding may require an open access publication), organisations involved in capturing or processing data (e.g. in 3D digitisation projects), and the individual researchers. Agreements set out at the start of a project can cover physical access to a monument, the IPR in the content that is produced, and licences for its use.
  • #6 Copyright protects your work and I will say more about this in a few moments. First I’m going to talk about licences, which are your way of saying what people may or may not do with your work. Licences cover attribution, permitted uses, derivatives and the terms under which new works may be shared.
  • #9 Internet Archaeology is an open access, independent, not-for-profit journal for archaeology. There are no subscription charges for readers. Paper proposals are accepted on research quality. Research sponsors, e.g. Research Councils, contribute to publication costs. Papers can be linked to a digital archive that has been deposited in a repository. In the case of the Heybridge monograph, both the publication and the archive are open access.
  • #10 This is what we mean by open data. Data which is accessible online may be available via a service that asks you to register before allowing access to the datasets. For example, KNAW-DANS provides an archiving service for the Netherlands. Users can search the EASY catalogue to find datasets, but need to register in order to download them.
  • #11 Data which can be re-used: by this we mean data that is available in an open format (not locked in PDF documents or in tables in a publication) that allows re-analysis.
  • #12 Open licences permit the re-use of data for free
  • #13 I’d like to give another example of how openly licensed data can have positive benefits, before looking at some of the barriers to providing access. In this project, an archive deposited by the University of Southampton consisting of scanned line drawings of Roman amphorae was used in a crowd-sourcing project. Members of the public were involved in creating scaled drawings which were later used to create 3D models. It is a nice example of public engagement while creating a data archive with a lot of potential for future research.
  • #14 We recognise that there are barriers to sharing data openly. One of the barriers is the primacy of published papers. By comparison there is currently limited academic reward for researchers who publish open data. Copyright in existing data can be a barrier (clearing the rights to enable open access may be time consuming or otherwise difficult). The dataset may include confidential or sensitive data that is blocked for publication, perhaps for a set period of time. Researchers may have concerns that publishing their data openly may mean that someone else may scoop (reveal) their results first. People worry about data being misused or misinterpreted. They worry about mistakes or errors in the data being spotted, and their reputations being damaged somehow. People worry about the amount of effort involved in preparing data for sharing (capturing metadata, sorting out licences etc). Lack of appropriate digital archives can also be a barrier. These are genuine concerns, but there are also benefits.
  • #16 The main benefits of sharing your data include increasing the visibility of your research results. Firstly, providing access to your data is a form of scholarly communication. By increasing the visibility of your research data, you can stimulate new research, have your results verified and start new collaborations. Enabling the re-use of well curated data increases research efficiency: no time is lost by re-creating data or as a result of a turnover in the research team.
  • #19 This is what we mean by open data. Data which is accessible online may be available via a service that asks you to register before allowing access to the datasets. Open data should be free at the point of use. Data which can be re-used: by this we mean data that is available in an open format (not locked in PDF documents or in tables in a publication) that allows re-analysis. Data which is made available under an open licence which allows derivatives to be created, for example Creative Commons licences such as CC-BY (attribution) or CC-BY-SA (attribution, share under the same conditions).