Publication &
Dissemination
of Data
James Baker, Lecturer in Digital
History/Archives
@j_w_baker
slideshare.net/drjwbaker
This work is licensed under a Creative
Commons Attribution-ShareAlike 4.0
International License. Exceptions: quotations,
embeds from external sources, logos, and
marked images.
@j_w_baker
Publication and Dissemination of Data
Session Plan
1) Good places to put your data
2) What to put with your data
3) Examples of best and not-so-best practice
4) Group work: critique
5) Individual work: sign-up, deposit data
1) Places to put your data
Zenodo
Good
EC, CERN, OpenAIRE
Generates DOIs
ORCID integration
Well supported
GitHub integration
Well used (50k deposits since 2013)
2GB file size
Flexible
Bad
Slightly clunky
2GB can prove small
1) Places to put your data
Figshare
Good
Generates DOIs
ORCID integration
Pro look and feel
Very well used (500k deposits since
2013)
5GB file size
Flexible
Bad
Ownership?
Bit of a free for all
1) Places to put your data
UK Data Archive/Service
Good
University of Essex
DOI generation
Longstanding
Official route
Lots of guidance
Bad
Geared to ESRC
Bit of a free for all
Deposit ‘by offer’ only
Getting data out tricky
1) Places to put your data
GitHub
Good
URL generation
Iterating option
Version control
Massive userbase
Markdown
Bad
Private company
Deposit hack
Limited metadata
1) Places to put your data
Your Institutional Repository
Speak to your librarian!!
1) Places to put your data
Wikidata
Example: https://tools.wmflabs.org/reasonator/?q=Q42
Info: https://www.wikidata.org/wiki/Wikidata:Introduction
Good
Embed data in ecosystem
Wikipedia backend
Linked data
Massive impact
Bad
Not a deposit venue
Who gives you credit?
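To make the "linked data" point concrete, here is a minimal sketch of reading one Wikidata item as JSON. It assumes the public `Special:EntityData` endpoint; the function names, the User-Agent string, and the tiny `sample_doc` (a hand-written stand-in for a real response, not fetched data) are illustrative assumptions.

```python
import json
from urllib.request import Request, urlopen


def entity_data_url(qid: str) -> str:
    """Build the JSON export URL for a Wikidata item such as Q42."""
    return f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"


def label_for(entity_doc: dict, qid: str, lang: str = "en"):
    """Return the item's label in the given language, or None if absent."""
    labels = entity_doc["entities"][qid].get("labels", {})
    entry = labels.get(lang)
    return entry["value"] if entry else None


def fetch_entity(qid: str) -> dict:
    """Download the item (needs network access; not called in this sketch)."""
    # Placeholder User-Agent: replace with something that identifies you.
    request = Request(
        entity_data_url(qid),
        headers={"User-Agent": "data-workshop-example/0.1 (placeholder)"},
    )
    with urlopen(request) as response:
        return json.load(response)


# Hand-written stand-in that mimics the shape of a real response.
sample_doc = {
    "entities": {
        "Q42": {"labels": {"en": {"language": "en", "value": "Douglas Adams"}}}
    }
}
sample_label = label_for(sample_doc, "Q42")
```

Calling `fetch_entity("Q42")` and passing the result to `label_for` would do the same against live data, which is what makes data placed in Wikidata reusable by other tools.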
2) What to put with your data
The core guiding principle is simple: Someone unfamiliar with
your project should be able to look at your computer files and
understand in detail what you did and why [..] Most
commonly, however, that “someone” is you. A
few months from now, you may not remember what you were
up to when you created a particular set of files, or you may not
remember what conclusions you drew. You will either have to
then spend time reconstructing your previous experiments or
lose whatever insights you gained from those experiments.
William Stafford Noble (2009) A Quick Guide to Organizing Computational
Biology Projects. PLoS Comput Biol 5(7): e1000424.
doi:10.1371/journal.pcbi.1000424
2) What to put with your data
Essentials
Capture decisions
Capture context
Describe the data
Describe who made the data
Choose a licence
Use a reusable data format
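One way to act on the essentials above is to generate a plain-text README from a small set of fields and refuse to write it if any are empty. This is only a sketch: the field names and the placeholder values are assumptions made for illustration, not a standard.

```python
# Hypothetical field names, chosen to mirror the essentials listed above.
REQUIRED_FIELDS = [
    "title",
    "description",  # describe the data
    "creators",     # describe who made the data
    "context",      # capture context
    "decisions",    # capture decisions
    "licence",      # choose a licence
    "formats",      # use a reusable data format
]


def build_readme(meta: dict) -> str:
    """Return README text, or raise ValueError if an essential is missing."""
    missing = [name for name in REQUIRED_FIELDS if not meta.get(name)]
    if missing:
        raise ValueError("README is missing: " + ", ".join(missing))
    title = str(meta["title"])
    lines = [title, "=" * len(title), ""]
    for name in REQUIRED_FIELDS[1:]:
        value = meta[name]
        if isinstance(value, (list, tuple)):
            value = "; ".join(str(item) for item in value)
        lines += [name.capitalize() + ":", "  " + str(value), ""]
    return "\n".join(lines)


# Placeholder values only; replace with your own project's details.
example_meta = {
    "title": "Example dataset (placeholder)",
    "description": "One row per record; columns explained in this file.",
    "creators": ["A. Researcher (placeholder)"],
    "context": "Why and how the data was collected.",
    "decisions": "What was included, excluded, or normalised, and why.",
    "licence": "CC BY-SA 4.0",
    "formats": ["CSV (UTF-8)"],
}
readme_text = build_readme(example_meta)
```

Saving `readme_text` as `README.txt` next to the data file keeps the documentation and the data together wherever the deposit ends up.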
3) Examples
Vagrant Lives: 14,789 Vagrants
Processed by Middlesex County, 1777-
1786 (version 1.1)
https://zenodo.org/record/31026#.V6CzRo78_6g
3) Examples
A Literary Tour de Force
http://robertdarnton.org/literarytour/booksellers
3) Examples
British Library Printed Music
http://www.bl.uk/bibliographic/download.html#basicmusic
4) Group Work
To do
Pick a resource
github.com/DocumentingHistory/Workshop-Programme
Critique it (15 mins)
Prepare to report back (5 mins)
Report back (5 mins each)
Questions to ask
- Is it clear what the data is?
- Do you think you'd be able to
reuse the data easily? (think licence,
format, description)
- Is it easy to give the depositor
credit for their work?
- What does the deposit do well
in your opinion?
- What could be improved?
5) Individual Work
Sign-up, improve, and/or deposit data
If you have some data... add some documentation to it
If you have some data and some documentation... package
and upload it somewhere
If you made a data research plan on day one... work
through it and map what you need to add to the plan based on
this session
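For the "package and upload it somewhere" step, a minimal sketch of bundling a folder of data and documentation into a single zip with a checksum manifest is shown below. The function name, the `MANIFEST.sha256` file name, and the placeholder files are assumptions; the upload itself is left to whichever repository you choose.

```python
import hashlib
import tempfile
import zipfile
from pathlib import Path


def package_deposit(folder, archive_path):
    """Zip every file in folder and add a SHA-256 manifest to the archive."""
    folder = Path(folder)
    files = sorted(p for p in folder.rglob("*") if p.is_file())
    manifest_lines = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in files:
            rel = path.relative_to(folder).as_posix()
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest_lines.append(f"{digest}  {rel}")
            zf.write(path, arcname=rel)
        # The manifest lets a later user check that files arrived intact.
        zf.writestr("MANIFEST.sha256", "\n".join(manifest_lines) + "\n")
    return manifest_lines


# Demonstration with placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    deposit = Path(tmp) / "deposit"
    deposit.mkdir()
    (deposit / "README.txt").write_text("Placeholder documentation\n", encoding="utf-8")
    (deposit / "data.csv").write_text("id,value\n1,example\n", encoding="utf-8")
    archive = Path(tmp) / "deposit.zip"
    manifest = package_deposit(deposit, archive)
    with zipfile.ZipFile(archive) as zf:
        packaged_names = sorted(zf.namelist())
```

Keeping the archive outside the folder being packaged avoids the zip trying to include itself.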
