Preferred Formats = Pre-FAIRed Formats

Presentation of the reasoning behind DANS’s preferred formats policy, with a demonstration of how such a policy contributes to producing FAIR data.

Published in: Data & Analytics

  1. 1. Preferred Formats = Pre-FAIRed Formats. Valentijn Gilissen, DANS-KNAW, Data Manager, Preservation Officer, valentijn.gilissen@dans.knaw.nl. ARIADNEplus is funded by the European Commission’s Horizon 2020 Programme.
  2. 2. DANS (https://dans.knaw.nl). Institute of the Dutch Academy and Research Funding Organisation (KNAW & NWO) since 2005. First predecessor dates back to 1964 (Steinmetz Foundation); Historical Data Archive, 1989. Mission: promote and provide permanent access to digital research resources.
  3. 3. DANS core data services. EASY: certified long-term Electronic Archiving System for self-deposit (https://easy.dans.knaw.nl). DataverseNL for short- and mid-term data storage (https://dataverse.nl). NARCIS: Gateway to scholarly information in the Netherlands (https://www.narcis.nl).
  4. 4. DANS additional services. Background Archive. Research Data Journal for the Humanities and Social Sciences (http://www.brill.com/rdj). Training & Consultancy (http://datasupport.researchdata.nl/). Ingest via SWORD protocol (Simple Web-service Offering Repository Deposit). Also shown: https://data.mendeley.com/, https://datadryad.org
  5. 5. EASY (Electronic Archiving SYstem), https://easy.dans.knaw.nl. CoreTrustSeal / Nestor Seal 2016. [Screenshot of the EASY home page: Register; Log in; New deposit; Browse; Advanced search; Search help; Search; Disclaimer; Legal information; Property Rights Statement; How to cite data.]
  6. 6. EASY (Electronic Archiving SYstem), 2020: 122,000+ datasets, of which 67,000+ Archaeology. [Screenshot of a dataset page: Overview; Cite as; Description; Data files (N).]
  7. 7. Self-depositing. Qualified Dublin Core metadata: Title; Alternative title; Creator / People & organisations; Date created; Description; Subject; Identifiers & Relations; Temporal coverage; Spatial coverage; Language; Access rights; Date available; Rightsholder; Publisher; Audience; Date; Licence; Source. Archaeology-specific metadata: Archis zaakID; ABR complex; ABR period. Personal data y/n. Reserve DOI. Upload Data.
  8. 8. OAIS in context. [Diagram labels: Producer; Consumer; Front-office; Machine to Machine; SWORD; OAI-PMH; Persistent Identifier; Citation.] ARIADNE-portal: http://portal.ariadne-infrastructure.eu/
  9. 9. Let's save these images in different formats! => TGA; RAW; CDR => PCX; BMP; PSD => JPG; TIF; PNG
  10. 10. Preferred Formats
  11. 11. Preferred Formats. https://dans.knaw.nl/en: Deposit => Read more about depositing data. [Website navigation: Before depositing; Metadata; What DANS does; Legal aspects; Quoting data; File Formats; Documentation; During depositing; After depositing.] PARTHENOS, Hollander, Hella, Morselli, Francesca, Uiterwaal, Frank, Admiraal, Femmy, Trippel, Thorsten, & Di Giorgio, Sara. (2018, December 1). PARTHENOS Guidelines to FAIRify data management and make data reusable. Zenodo. http://doi.org/10.5281/zenodo.2668479
  12. 12. Preferred Formats https://dans.knaw.nl/en/about/services/easy/information-about-depositing-data/before-depositing/file-formats
  13. 13. Preferred Formats. As a general guideline, DANS considers the file formats best suited for long-term preservation and accessibility to be those which: are commonly used; have open specifications; are independent of specific software, developers or suppliers. https://dans.knaw.nl/en/about/services/easy/information-about-depositing-data/before-depositing/file-formats
  14. 14. Preferred Formats (slide legend: Preferred Format / Non-preferred Format)
      Text documents: • PDF/A (.pdf) • ODT (.odt) • Microsoft Word (.doc) • Office Open XML (.docx) • Rich Text File (.rtf) • PDF other than PDF/A (.pdf)
      Spreadsheets: • ODS (.ods) • CSV (.csv) • Microsoft Excel (.xls) • Office Open XML Workbook (.xlsx) • PDF/A (.pdf)
      Geographical Information (GIS): • GML (.gml) • MIF/MID (.mif/mid) • Esri Shapefiles (.shp & related files) • MapInfo (.tab & related files) • KML (.kml) • Esri Geodatabase (.gdb) • Project files/Workspaces (.mxd, .wor, .qgs)
      3D: • WaveFront Object (.obj) • Polygon file format (.ply) • X3D (.x3d) • COLLADA (.dae) • Autodesk FBX (.fbx) • Blender (.blend) • 3D PDF (.pdf)
  20. 20. Spreadsheets / Data Tables to CSV. Frankema, Prof. dr. E. (Wageningen University); Woltjer, P. (Wageningen University); Dalrymple-Smith, A. (Wageningen University); Bulambo, L. (Wageningen University) (2017): An Introduction to the African Commodity Trade Database, 1730-2010. DANS. https://doi.org/10.17026/dans-xt9-fzkw
  21. 21. Spreadsheets. Example formula: =IF(S8<>"",S8/ToT_regions!$N8*100,"") Frankema, Prof. dr. E. (Wageningen University); Woltjer, P. (Wageningen University); Dalrymple-Smith, A. (Wageningen University); Bulambo, L. (Wageningen University) (2017): An Introduction to the African Commodity Trade Database, 1730-2010. DANS. https://doi.org/10.17026/dans-xt9-fzkw
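The spreadsheet formula on slide 21 returns a percentage share when the source cell is filled and an empty string otherwise. A CSV export stores only the resulting values, not the formula. The following is a minimal sketch of that behaviour in Python; the function name `share_percent`, the column names, and the sample rows are hypothetical and not taken from the dataset.

```python
import csv
import io


def share_percent(value, total):
    """Mirror =IF(S8<>"",S8/ToT_regions!$N8*100,""): empty in, empty out."""
    if value == "":
        return ""
    return value / total * 100


# Hypothetical sample rows: (value as in column S, total as in ToT_regions column N).
rows = [(25.0, 200.0), ("", 200.0), (50.0, 200.0)]

# Write the computed values to CSV text; the formula itself is not stored.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["value", "total", "share_percent"])
for value, total in rows:
    writer.writerow([value, total, share_percent(value, total)])

csv_text = buffer.getvalue()
print(csv_text)
```

Because the CSV holds values only, keeping the original Excel files alongside the export preserves the formulas for anyone who needs to see how a value was derived.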
  22. 22. Preferred Formats. Export folder: CSV and PDF/A exports; Excel files. Frankema, Prof. dr. E. (Wageningen University); Woltjer, P. (Wageningen University); Dalrymple-Smith, A. (Wageningen University); Bulambo, L. (Wageningen University) (2017): An Introduction to the African Commodity Trade Database, 1730-2010. DANS. https://doi.org/10.17026/dans-xt9-fzkw
  23. 23. Preferred Formats: Original / Processed
  24. 24. 3D
  25. 25. 3D. Nagel, D. (Wildcard); Cocquyt, T. (HuygensING) (2013): Animated, interactive 3D visualization of a corn mill, after a description by Ramelli. DANS. https://doi.org/10.17026/dans-zzq-ymge
  26. 26. 3D. Trognitz, M. (IANUS); Niven, K. (ADS); Gilissen, V. (DANS) (2016): 3D Models in Archaeology: A Guide to Good Practice. https://guides.archaeologydataservice.ac.uk/g2gp/3d_Toc
  27. 27. Preferred Formats https://dans.knaw.nl/en/about/services/easy/information-about-depositing-data/before-depositing/file-formats
  28. 28. https://snd.gu.se/en/data-management/guides/file-formats
  29. 29. https://guides.archaeologydataservice.ac.uk/g2gp/
  30. 30. https://www.loc.gov/preservation/resources/rfs/ https://www.loc.gov/preservation/digital/formats/
  31. 31. https://www.ukdataservice.ac.uk/manage-data/format/recommended-formats.aspx
  32. 32. https://www.archivematica.org/en/docs/archivematica-1.10/user-manual/preservation/preservation-planning/ https://wiki.archivematica.org/Significant_characteristics
  33. 33. Preferred Formats. The PARTHENOS Policy Wizard: http://www.parthenos-project.eu/portal/wizard [Diagram labels: Policy Wizard; DANS guidelines; ADS guidelines; SND guidelines; DANS; ADS; SND; Knowledge Platform; SEADDA / ARIADNEplus communities; shared guidelines; experts; …]
  34. 34. Preferred Formats. Knowledge Platform on GitHub - Proof of Concept: https://dans-labs.github.io/formats/ https://github.com/dataformats
  35. 35. Migrations to Preferred Formats. Mass migrations and transformations of archived data to new standards. “Upgrading is compulsory.” (the Cybermen; Doctor Who, BBC Studios, 1963-2019)
  36. 36. Migrations to Preferred Formats. Source formats: Word, WordPerfect; Access. Target formats: PDF/A; CSV.
  37. 37. Migrations to Preferred Formats. Workflow steps: File identification (mediatype); Selection filter: visible files; Extraction from archive (Python); Checksum validation (shown four times in the workflow); Double conversion (Python); Adding provenance metadata to file IDs; Generating logfiles; Archival storage.
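Two of the steps named on slide 37, file identification by media type and checksum validation, can be sketched with the Python standard library. This is only an illustration under stated assumptions: the slide does not say which tools or hash algorithm DANS uses, so SHA-256, extension-based type guessing, the function names, and the sample file are all hypothetical.

```python
import hashlib
import mimetypes
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def identify_media_type(path):
    """Guess the media type from the file name (extension-based only)."""
    media_type, _encoding = mimetypes.guess_type(str(path))
    return media_type or "application/octet-stream"


def copy_with_validation(source, target):
    """Copy a file, confirm the checksum is unchanged, return a provenance record."""
    expected = sha256_of(source)
    shutil.copyfile(source, target)
    actual = sha256_of(target)
    if actual != expected:
        raise ValueError(f"checksum mismatch for {target}")
    return {
        "source": Path(source).name,
        "media_type": identify_media_type(source),
        "sha256": actual,
    }


# Demonstration on a throwaway file (hypothetical content).
with tempfile.TemporaryDirectory() as workdir:
    original = Path(workdir) / "table.csv"
    original.write_bytes(b"year,value\n1730,1\n")
    record = copy_with_validation(original, Path(workdir) / "table_copy.csv")
    print(record)
```

Validating the checksum after each transfer, rather than once at the end, makes it possible to tell at which step a file was altered; that may be why the slide shows the validation step several times.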
  38. 38. Preservation Watch. Free-to-use images from www.pexels.com (Pixabay / T. Malík)
  39. 39. http://doi.org/10.5281/zenodo.2668479
  40. 40. THANK YOU! ARIADNE is a project funded by the European Commission under the H2020 Programme, contract no. H2020-INFRAIA-2018-1-823914. The views and opinions expressed in this presentation are the sole responsibility of the authors and do not necessarily reflect the views of the European Commission. Contact: hella.hollander@dans.knaw.nl, valentijn.gilissen@dans.knaw.nl, julian.richards@york.ac.uk. www.ariadne-infrastructure.eu, www.dans.knaw.nl
