dans.knaw.nl
DANS is an institute of KNAW and NWO
My data manager is a robot!
Mass ingests and migrations & network integrations
Valentijn Gilissen, MA: Data Manager / Preservation Officer
April 2019, CAA, Krakow
Use-cases
• The SWORD-ingest of Dutch archaeological datasets by the network of governmental
depots into the central DANS hub.
• Mass migrations and transformations of archived data to new standards.
• The promotion and integration of local data from the Portable Antiquities of the
Netherlands (PAN) in an international network, making use of thesauri, data mining
and Linked Open-Data techniques.
“How is humanity saved if it's not
allowed to... evolve?”
--Ultron
Avengers: Age of Ultron.
Directed by Joss Whedon. Marvel Studios, 2015
To support the ingest and validation of
increasing volumes of data, the role of
the data manager will need to adapt.
--Valentron
Data Archiving and Networked Services
https://dans.knaw.nl
• Institute of the Dutch Academy and Research Funding Organisation (KNAW & NWO) since 2005
• First predecessor dates back to 1964 (Steinmetz Foundation), Historical Data Archive 1989
• Mission: promote and provide permanent access to digital research resources
DANS core data services
• EASY: certified long-term Electronic Archiving System for self-deposit (https://easy.dans.knaw.nl)
• DataverseNL for short- and mid-term data storage (https://dataverse.nl)
• NARCIS: Gateway to scholarly information in the Netherlands (https://www.narcis.nl)
DANS additional services
• Research Data Journal for the Humanities and Social Sciences (http://www.brill.com/rdj)
• Background Archive (https://data.mendeley.com/, https://datadryad.org)
• Training & Consultancy (http://datasupport.researchdata.nl/)
• Ingest via SWORD protocol (Simple Web-service Offering Repository Deposit)
The e-Depot for Dutch Archaeology
• >40,000 archaeological datasets
• 76% available without restrictions
• Contents: field drawings/GIS, images, publications, data tables, photographs
Open Archival Information System
What is a ‘Trusted Digital Repository’?
• Mission to provide the designated community with trustworthy long-term access to curated digital resources
• Constant monitoring, planning and maintenance
• Knowledge of/measures against: threats and risks within systems
• Regular checking and/or certification
• Certificates: 3 standards, 3 levels
http://www.trusteddigitalrepository.eu
Standards:
• OAIS (ISO 14721)
• Trusted Digital Repositories: Attributes and Responsibilities
• TRAC
• Audit and Certification of Trustworthy Digital Repositories (ISO 16363)
• Bodies Providing Audit and Certification (ISO 16919)
See http://wiki.digitalrepositoryauditandcertification.org and
http://www.alliancepermanentaccess.org/membership/member-resources/audit-and-certification
Standards will be available free from http://www.ccsds.org
ISO 16363, metrics concerning:
• Organizational Infrastructure
  e.g. The repository shall have a documented history of the changes to its operations, procedures, software, and hardware.
• Digital Object Management
  e.g. The repository shall have access to necessary tools and resources to provide authoritative Representation Information for all of the digital objects it contains.
• Infrastructure and Security Risk Management
  e.g. The repository shall have procedures in place to evaluate when changes are needed to current software.
ISO 16919: … trustworthiness of digital repositories using ISO 16363. It covers principles needed to inspire confidence that third party certification of the management of the digital repository has been performed with impartiality, competence, responsibility, openness, confidentiality, and responsiveness to complaints.
Levels (European Framework for Audit and Certification of Digital Repositories, to be promoted by the EU):
• Basic Certification: Data Seal of Approval, monitored self-audit using DSA metrics
• Extended Certification: monitored self-audit using ISO 16363 (or DIN 31644 in Germany)
• Formal Certification: audit by external auditors
EASY: Electronic Archiving SYstem
https://easy.dans.knaw.nl
CoreTrustSeal / Nestor Seal 2016
[Screenshots: EASY home page (Register, Log in, New deposit, Browse, Advanced search, How to cite data) and a dataset page with the tabs Overview, Cite as, Description, Data files (N)]
Self-depositing
Qualified Dublin Core metadata:
• Title, Alternative title, Creator, Contributor, Date created, Description
• Subject, Coverage, Identifier, Relation, Temporal, Spatial
• Type, Format, Language
• Access rights, Date available, Remarks, Rights holder, Publisher, Audience, Source, Date
Upload Files
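To make the deposit form concrete, here is a minimal sketch of how a few of these fields could be serialised as qualified Dublin Core XML. The namespaces are the standard DCMI ones. The `metadata` wrapper element, the `build_record` helper, the list of required fields and the sample values are assumptions for illustration, not EASY's actual schema or validation rules.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
ET.register_namespace("dc", DC)
ET.register_namespace("dcterms", DCTERMS)

# The fifteen simple Dublin Core elements; anything else is treated as a DCMI term.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher", "contributor",
    "date", "type", "format", "identifier", "source", "language", "relation",
    "coverage", "rights",
}

# Hypothetical minimum set for this sketch (not EASY's real rules).
REQUIRED = ("title", "creator", "created", "description", "accessRights")


def build_record(fields: dict) -> ET.Element:
    """Build a flat Dublin Core record, refusing incomplete input."""
    missing = [name for name in REQUIRED if not fields.get(name)]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for name, value in fields.items():
        namespace = DC if name in DC_ELEMENTS else DCTERMS
        ET.SubElement(root, f"{{{namespace}}}{name}").text = value
    return root


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    record = build_record({
        "title": "Example excavation dataset",
        "creator": "Example Archaeology Unit",
        "created": "2019-04-01",
        "description": "Field drawings and find tables.",
        "accessRights": "Open access",
    })
    print(ET.tostring(record, encoding="unicode"))
```

Checking for missing fields before building the record mirrors the first step of the data manager's checklist (check Dublin Core, edit where necessary), but moves it to the moment of deposit.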
Data-managing
• Check Dublin Core, edit/modify where necessary
• Assign project codes (if required)
• Download files, check for completeness / privacy-sensitive data
• Migrate files to preferred formats (if required/necessary)
• Modify directory structure (if necessary)
• Upload preferred formats
• Check individual file metadata, edit/modify if necessary
• Add individual file metadata
• Publish files (set visibility/accessibility rights)
• Create a ‘Jumpoff’ presentation page
• Check workflow
• Publish dataset
• Relate dataset to related datasets or web pages
• End administration
Case 1: I, Robot
The SWORD-ingest of Dutch archaeological datasets by the network of
governmental depots into the central DANS hub.
“I’d give you advice, but you wouldn’t listen. No one ever does.”
--Marvin the Paranoid Android
(Adams, Douglas, 1952-2001. The Hitchhiker's Guide to the Galaxy.
New York: Harmony Books, 1980. Print.)
Reality: guidance => monitoring => feedback => effect change
--Valentijn the Preservation Officer
Provincial Depots
Front-office/Back-office model
PDBS
Provinciaal Depot Beheer Systeem
(Provincial Depot Management System)
Open Archival Information System
Persistent Identifier Citation
Front-office
Machine to Machine: SWORD, OAI-PMH, REST-API
PRODUCER
CONSUMER
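The machine-to-machine deposit route can be illustrated with a minimal sketch of a SWORD v2 binary deposit request, prepared but not sent. The header names follow the SWORD v2 profile as far as I know it. The collection URL, the `build_sword_deposit` helper, the SimpleZip packaging choice and the hex encoding of the MD5 digest are assumptions for illustration and do not describe the actual DANS or PDBS configuration.

```python
import hashlib

# Packaging identifier for a plain zip file; which packaging a given
# repository accepts is an assumption here.
SIMPLE_ZIP = "http://purl.org/net/sword/package/SimpleZip"


def build_sword_deposit(collection_iri: str, package: bytes, filename: str,
                        in_progress: bool = False) -> dict:
    """Describe the HTTP POST that would deposit one zip package."""
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": f"attachment; filename={filename}",
        "Packaging": SIMPLE_ZIP,
        # "false" tells the server the deposit is complete.
        "In-Progress": "true" if in_progress else "false",
        # Lets the server detect a corrupted transfer (hex form assumed).
        "Content-MD5": hashlib.md5(package).hexdigest(),
    }
    return {"method": "POST", "url": collection_iri,
            "headers": headers, "body": package}


if __name__ == "__main__":
    deposit = build_sword_deposit(
        "https://repository.example.org/sword2/collection/1",  # hypothetical
        b"stand-in for the bytes of a zip package",
        "dataset.zip",
    )
    for name, value in deposit["headers"].items():
        print(f"{name}: {value}")
```

Returning a plain description of the request keeps the sketch independent of any HTTP client; the dictionary can be handed to whichever client the depot system already uses.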
Guides to Good Practice
Before depositing
Metadata
What DANS does
Legal aspects
Quoting data
https://dans.knaw.nl/en
Deposit => Read more about depositing data
File Formats
http://www.parthenos-project.eu/portal/policies_guidelines
Documentation
During depositing
After depositing
Case 2: Transformers!
Mass migrations and transformations of archived data to new standards.
“Upgrading is compulsory.”
--the Cybermen
Doctor Who, BBC Studios, 1963-2019
Reality: guiding => monitoring => migrating where relevant => update documents
--the Archiving staff (Trusted Digital Repositories)
Preferred Formats
Non-preferred format(s)
As a general guideline, DANS considers the file formats best suited for long-term preservation and accessibility to be those which:
- are commonly used
- have open specifications
- are independent of specific software, developers or suppliers
Archaeological data deposited in EASY
• Publications, Reports: PDF/A
• CAD drawings/GIS maps: DXF R12 / MID+MIF
• Field drawings (scans), Photographs: JPEG + TIFF
• Vector Images: SVG
• Data tables (databases / spreadsheets): CSV
• Word, WordPerfect => PDF/A
• Access => CSV
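A selection step for the non-preferred formats named on this slide (Word, WordPerfect, Access) could look like the sketch below. The slide names only the applications, so the file extensions, the `MIGRATION_TARGETS` table and both function names are assumptions made for illustration.

```python
from pathlib import PurePath

# Extension -> preferred target format. The extensions are assumed;
# the targets (PDF/A, CSV) are the ones shown on the slide.
MIGRATION_TARGETS = {
    ".doc": "PDF/A",
    ".docx": "PDF/A",
    ".wpd": "PDF/A",
    ".mdb": "CSV",
    ".accdb": "CSV",
}


def migration_target(filename: str):
    """Return the preferred format for a file, or None if no migration is planned."""
    return MIGRATION_TARGETS.get(PurePath(filename).suffix.lower())


def select_for_migration(filenames):
    """Map each file that needs migrating to its target format."""
    selected = {}
    for name in filenames:
        target = migration_target(name)
        if target is not None:
            selected[name] = target
    return selected


if __name__ == "__main__":
    print(select_for_migration(["report.DOC", "finds.mdb", "trench1.tif"]))
```

Matching on the extension alone is the weakest form of identification; a production workflow would more likely rely on the file identification (mediatype) step of the migration pipeline.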
Mass migrations to Preferred Formats
• File identification (mediatype)
• Selection filter: visible files
• Extraction from archive (Python)
• Double conversion (Python)
• Adding provenance metadata to file IDs
• Generating logfiles
• Archival storage
• Checksum validation (repeated at four points in the workflow)
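The extraction, checksum validation and logging steps of such a workflow can be sketched as follows. This is an illustration under stated assumptions, not the DANS scripts: SHA-256 as the checksum algorithm, a plain file copy as the extraction step, a JSON-lines logfile, and the function names are all hypothetical, and the conversion step itself is left out.

```python
import hashlib
import json
import mimetypes
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def extract_with_validation(source: Path, workdir: Path, logfile: Path) -> dict:
    """Copy one file out of the archive, verify the copy, and log the result."""
    expected = sha256_of(source)
    target = workdir / source.name
    shutil.copy2(source, target)
    actual = sha256_of(target)
    if actual != expected:
        raise ValueError(f"checksum mismatch for {source.name}")
    # Guess the media type from the file name only (no content inspection).
    mediatype, _ = mimetypes.guess_type(source.name)
    record = {"file": source.name, "mediatype": mediatype,
              "sha256": actual, "status": "validated"}
    with logfile.open("a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return record


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        archive, work = Path(tmp) / "archive", Path(tmp) / "work"
        archive.mkdir()
        work.mkdir()
        sample = archive / "finds.csv"
        sample.write_text("id,find\n1,fibula\n", encoding="utf-8")
        print(extract_with_validation(sample, work, Path(tmp) / "migration.log"))
```

Computing the checksum on both sides of every copy is what allows the logfile to serve as provenance: each line records that the file leaving the archive and the file entering the next step were byte-identical.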
Case 3: Automatic for the People
The promotion and integration of local data from the Portable Antiquities of
the Netherlands (PAN) in an international network, making use of thesauri,
data mining and Linked Open-Data techniques.
“I am fluent in over six million forms of communication.”
--Protocol droid C3PO
Star Wars: Episode VI - Return of the Jedi. Directed by Richard Marquand. Lucasfilm Ltd. LLC, 1983
Reality: mapping metadata => harvesting => adding sources => enable access
--Protocol-operating data manager V@L3NT1JN
PAN – Portable Antiquities of the Netherlands
CARARE-project:
‘Open Access’ archaeological publications visible in Europeana
http://www.carare.eu/
ARIADNE-portal:
http://portal.ariadne-infrastructure.eu/
[Diagram: network of data flows]
• Actors: researchers, excavators, depot holders, national initiatives, international initiatives and portals
• Flows: depositing; searching & downloading; OAI-PMH harvesting; depositing via SWORD
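The harvesting side of this network can be sketched with the standard OAI-PMH `ListRecords` verb and its resumption token. The request parameters and the XML namespace come from the OAI-PMH 2.0 specification. The base URL, the function names and the sample response are made up for illustration and do not reproduce output from EASY, PAN or any portal.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request; a resumption token replaces all other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text):
    """Return the record identifiers on one page and the token for the next page."""
    root = ET.fromstring(xml_text)
    identifiers = [el.text for el in
                   root.findall(".//oai:header/oai:identifier", OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Invented response, trimmed to the elements the parser reads.
SAMPLE_PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:dataset-1</identifier></header></record>
    <record><header><identifier>oai:example.org:dataset-2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    base = "https://repository.example.org/oai"  # hypothetical endpoint
    print(list_records_url(base))
    ids, token = parse_page(SAMPLE_PAGE)
    print(ids, list_records_url(base, resumption_token=token))
```

A harvester repeats the request with each returned token until a page arrives without one, which is how a portal can collect a complete set of records without the repository having to serve them in a single response.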
General contact:
Info@DANS.KNAW.NL
Head Data Archive:
Hella.Hollander@DANS.KNAW.NL
Senior Data Steward / Preservation
Officer:
Valentijn.Gilissen@DANS.KNAW.NL
Watch our videos on YouTube:
https://www.youtube.com/user/
DANSDataArchiving
Thanks for listening!

02 2019 caa_krakowvg

  • 1. dans.knaw.nl DANS is een instituut van KNAW en NWO My data manager is a robot! Mass ingests and migrations & network integrations Valentijn Gilissen, MA: Data Manager / Preservation Officer April 2019, CAA, Krakow
  • 2. Use-cases • The SWORD-ingest of Dutch archaeological datasets by the network of governmental depots into the central DANS hub. • Mass migrations and transformations of archived data to new standards. • The promotion and integration of local data from the Portable Antiquities of the Netherlands (PAN) in an international network, making use of thesauri, data mining and Linked Open-Data techniques. “How is humanity saved if it's not allowed to... evolve?” --Ultron Avengers: Age of Ultron. Directed by Joss Whedon. Marvel Studios, 2015 To support the ingest and validation of increasing volumes of data, the role of the data manager will need to adapt. --Valentron
  • 3. Data Archiving and Networked Services (https://dans.knaw.nl). Institute of the Dutch Academy and Research Funding Organisation (KNAW & NWO) since 2005. First predecessor dates back to 1964 (Steinmetz Foundation); Historical Data Archive 1989. Mission: promote and provide permanent access to digital research resources.
  • 4. DANS core data services
      • NARCIS: Gateway to scholarly information in the Netherlands (https://www.narcis.nl)
      • DataverseNL for short- and mid-term data storage (https://dataverse.nl)
      • EASY: certified long-term Electronic Archiving System for self-deposit (https://easy.dans.knaw.nl)
  • 5. DANS additional services
      • Research Data Journal for the Humanities and Social Sciences (http://www.brill.com/rdj)
      • Background Archive (https://data.mendeley.com/, https://datadryad.org)
      • Training & Consultancy (http://datasupport.researchdata.nl/)
      • Ingest via SWORD protocol (Simple Web-service Offering Repository Deposit)
  • 6. The e-Depot for Dutch Archaeology: >40,000 archaeological datasets, 76% available without restrictions. Content types: field drawings/GIS, images, publications, data tables, photographs.
  • 8. What is a ‘Trusted Digital Repository’? (http://www.trusteddigitalrepository.eu)
      • Mission to provide the designated community with trustworthy long-term access to curated digital resources
      • Constant monitoring, planning and maintenance
      • Knowledge of/measures against: threats and risks within systems
      • Regular checking and/or certification
      • Certificates: 3 standards, 3 levels
    Standards: OAIS (ISO 14721); Trusted Digital Repositories: Attributes and Responsibilities; TRAC; Audit and Certification of Trustworthy Digital Repositories (ISO 16363); Bodies Providing Audit And Certification (ISO 16919). See http://wiki.digitalrepositoryauditandcertification.org and http://www.alliancepermanentaccess.org/membership/member-resources/audit-and-certification. Standards will be available free from http://www.ccsds.org
    ... trustworthiness of digital repositories using ISO 16363. It covers principles needed to inspire confidence that third party certification of the management of the digital repository has been performed with impartiality, competence, responsibility, openness, confidentiality, and responsiveness to complaints.
    Metrics concerning:
      • Organizational Infrastructure, e.g. The repository shall have a documented history of the changes to its operations, procedures, software, and hardware.
      • Digital Object Management, e.g. The repository shall have access to necessary tools and resources to provide authoritative Representation Information for all of the digital objects it contains.
      • Infrastructure and Security Risk Management, e.g. The repository shall have procedures in place to evaluate when changes are needed to current software.
    European Framework for Audit and Certification of Digital Repositories (to be promoted by the EU):
      • Basic Certification (Data Seal of Approval): monitored self-audit using DSA metrics
      • Extended Certification: monitored self-audit using ISO 16363 (or DIN 31644 in Germany)
      • Formal Certification: audit by external auditors
  • 9. EASY: Electronic Archiving SYstem (https://easy.dans.knaw.nl). Register; Log in; New deposit; Browse; Advanced search; Search help; Search; Disclaimer; Legal information; Property Rights Statement; How to cite data. CoreTrustSeal / Nestor Seal 2016
  • 10. EASY: Electronic Archiving SYstem. Overview; Cite as; Description; Data files (N)
  • 11. Self-depositing: Qualified Dublin Core metadata. Fields: Title, Alternative title, Creator, Contributor, Date created, Description, Subject, Coverage, Identifier, Relation, Temporal, Spatial, Type, Format, Language, Access rights, Date available, Remarks, Rights holder, Publisher, Audience, Source, Date. Upload Files.
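A deposit form like this ultimately produces a Dublin Core record. The sketch below shows one way such a record could be checked for empty fields and serialised as XML. The two namespace URIs are the standard DCMI ones; everything else (the `REQUIRED` set, the `metadata` wrapper element, the function names and the sample values) is a hypothetical illustration and not a description of how EASY works.

```python
import xml.etree.ElementTree as ET

# Standard DCMI namespaces.
NAMESPACES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Hypothetical minimum set, for illustration only.
REQUIRED = ("dc:title", "dc:creator", "dc:description",
            "dcterms:created", "dcterms:accessRights")


def missing_fields(record, required=REQUIRED):
    """Return the required keys that are absent or hold only blank values."""
    return [key for key in required
            if not any(value.strip() for value in record.get(key, []))]


def to_xml(record):
    """Serialise a {"prefix:term": [values]} mapping as namespaced XML."""
    for prefix, uri in NAMESPACES.items():
        ET.register_namespace(prefix, uri)
    root = ET.Element("metadata")  # hypothetical wrapper element
    for key, values in record.items():
        prefix, term = key.split(":", 1)
        for value in values:
            ET.SubElement(root, f"{{{NAMESPACES[prefix]}}}{term}").text = value
    return ET.tostring(root, encoding="unicode")


# Invented sample record.
record = {
    "dc:title": ["Sample excavation dataset"],
    "dc:creator": ["Jansen, A."],
    "dc:description": ["Invented record for illustration."],
    "dcterms:created": ["2019-04-01"],
    "dcterms:accessRights": ["Open access"],
    "dcterms:spatial": ["Utrecht"],
}

print(missing_fields(record))
print(to_xml(record))
```

Keeping the completeness check separate from serialisation means the same check can run on self-deposits and on machine-to-machine deposits alike.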
  • 12. Data-managing
      • Check Dublin Core, edit/modify where necessary
      • Assign project codes (if required)
      • Download files, check for completeness / privacy-sensitive data
      • Migrate files to preferred formats (if required/necessary)
      • Modify directory structure (if necessary)
      • Upload preferred formats
      • Check individual file metadata, edit/modify if necessary
      • Add individual file metadata
      • Publish files (set visibility/accessibility rights)
      • Create a ‘Jumpoff’ presentation page
      • Check workflow
      • Publish dataset
      • Relate dataset to related datasets or web pages
      • End administration
  • 13. Case 1: I, Robot. The SWORD-ingest of Dutch archaeological datasets by the network of governmental depots into the central DANS hub. “I’d give you advice, but you wouldn’t listen. No one ever does.” --Marvin the Paranoid Android (Adams, Douglas, 1952-2001. The Hitchhiker's Guide to the Galaxy. New York: Harmony Books, 1980. Print.) Reality: guidance => monitoring => feedback => effect change --Valentijn the Preservation Officer
  • 15. Front-office/Back-office model PDBS Provinciaal Depot Beheer Systeem (Provincial Depot Management System)
  • 16. Open Archival Information System. Persistent Identifier; Citation; Front-office; Machine to Machine: SWORD, OAI-PMH, REST-API. PRODUCER / CONSUMER
  • 19. Guides to Good Practice Before depositing Metadata What DANS does Legal aspects Quoting data https://dans.knaw.nl/en Deposit => Read more about depositing data File Formats http://www.parthenos-project.eu/portal/policies_guidelines Documentation During depositing After depositing
  • 20. Case 2: Transformers! Mass migrations and transformations of archived data to new standards. “Upgrading is compulsory.” --the Cybermen Doctor Who, BBC Studios, 1963-2019 Reality: guiding => monitoring => migrating where relevant => update documents --the Archiving staff (Trusted Digital Repositories)
  • 22. Preferred Formats / Non-preferred format(s). As a general guideline, DANS considers that the file formats best suited for long-term preservation and accessibility are file formats which: are commonly used; have open specifications; are independent of specific software, developers or suppliers.
  • 23. Archaeological data deposited in EASY, with preferred formats: Publications and Reports: PDF/A; Field drawings (scans) and Photographs: JPEG + TIFF; Vector images: SVG; Data tables (databases / spreadsheets): CSV; CAD drawings/GIS maps: DXF R12 / MID+MIF
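The data types and preferred formats on this slide can be read as a lookup table that decides whether a deposited file needs migrating. The sketch below is a minimal version of that idea. The pairing of data types with formats is inferred from the flattened slide text and should be treated as an assumption, as should the names `PREFERRED_FORMATS` and `needs_migration`.

```python
# Pairing inferred from the slide; treat it as an assumption, not DANS policy.
PREFERRED_FORMATS = {
    "publications": ("PDF/A",),
    "reports": ("PDF/A",),
    "field drawings (scans)": ("JPEG", "TIFF"),
    "photographs": ("JPEG", "TIFF"),
    "vector images": ("SVG",),
    "data tables": ("CSV",),
    "cad drawings/gis maps": ("DXF R12", "MID+MIF"),
}


def needs_migration(data_type, current_format):
    """Return True if the format is not preferred, False if it is,
    and None when the data type is unknown (left to a data manager)."""
    preferred = PREFERRED_FORMATS.get(data_type.strip().lower())
    if preferred is None:
        return None
    return current_format.strip().upper() not in {fmt.upper() for fmt in preferred}


print(needs_migration("Photographs", "tiff"))
print(needs_migration("Data tables", "XLSX"))
print(needs_migration("Audio", "WAV"))
```

Returning `None` for unknown data types keeps the automated step from making a decision that the slides reserve for guidance and monitoring by archive staff.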
  • 25. Mass migrations to Preferred Formats. Workflow steps: file identification (mediatype); selection filter: visible files; extraction from archive (Python); double conversion (Python); adding provenance metadata to file IDs; generating logfiles; archival storage. Checksum validation appears four times in the workflow diagram.
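The workflow on this slide repeats checksum validation around the extraction and conversion steps. A minimal sketch of that pattern follows, using only the Python standard library. The choice of SHA-256, the function names, the log-entry fields and the tab-separated-to-CSV converter are all assumptions made for illustration; the slides do not specify the algorithm or the conversion tools DANS uses.

```python
import csv
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tsv_to_csv(path):
    """Hypothetical stand-in converter: tab-separated text to CSV."""
    target = path.with_suffix(".csv")
    with open(path, newline="", encoding="utf-8") as src, \
         open(target, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst, lineterminator="\n")
        writer.writerows(csv.reader(src, delimiter="\t"))
    return target


def migrate_file(source, workdir, convert, expected_checksum):
    """Validate, extract, re-validate, convert, and return a log entry."""
    source, workdir = Path(source), Path(workdir)
    if sha256_of(source) != expected_checksum:
        raise ValueError(f"checksum mismatch in archive: {source.name}")
    extracted = workdir / source.name
    shutil.copy2(source, extracted)  # stands in for extraction from the archive
    if sha256_of(extracted) != expected_checksum:
        raise ValueError(f"checksum mismatch after extraction: {source.name}")
    converted = convert(extracted)
    return {
        "source": source.name,
        "source_sha256": expected_checksum,
        "result": converted.name,
        "result_sha256": sha256_of(converted),
    }


with tempfile.TemporaryDirectory() as tmp:
    archive, work = Path(tmp, "archive"), Path(tmp, "work")
    archive.mkdir()
    work.mkdir()
    stored = archive / "finds.tsv"
    stored.write_text("id\tmaterial\n1\tbronze\n", encoding="utf-8")
    print(migrate_file(stored, work, tsv_to_csv, sha256_of(stored)))
```

The returned dictionary plays the role of a provenance log entry: it ties the checksum of the original file to the checksum of the converted file, so the result can be verified again once it is written to archival storage.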
  • 26. Case 3: Automatic for the People. The promotion and integration of local data from the Portable Antiquities of the Netherlands (PAN) in an international network, making use of thesauri, data mining and Linked Open-Data techniques. “I am fluent in over six million forms of communication.” --Protocol droid C3PO (Star Wars: Episode VI - Return of the Jedi. Directed by Richard Marquand. Lucasfilm Ltd. LLC, 1983) Reality: mapping metadata => harvesting => adding sources => enable access --Protocol-operating data manager V@L3NT1JN
  • 27. PAN – Portable Antiquities of the Netherlands
  • 28. CARARE-project: ‘Open Access’ archaeological publications visible in Europeana http://www.carare.eu/
  • 30. Initiatives. Parties: Researchers, Excavators, Depot holders, National Initiatives, International portals. Data flows in the diagram: depositing; depositing via SWORD; OAI-PMH harvesting; searching & downloading.
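OAI-PMH harvesting, as named on this slide, works by sending a `ListRecords` request and following `resumptionToken` values until the list is exhausted. The sketch below builds such a request URL and parses a response, using only the standard library and no network access. The verb, the `oai_dc` metadata prefix and the XML namespaces come from the OAI-PMH 2.0 specification; the base URL, the identifiers and the sample response are invented, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build a ListRecords URL; a resumption token excludes other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text):
    """Return ([(identifier, title), ...], resumption_token_or_None)."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        title = record.findtext(f".//{{{DC_NS}}}title")
        records.append((identifier, title))
    token = root.findtext(f".//{{{OAI_NS}}}resumptionToken")
    return records, (token or None)


# Invented response, shaped like an OAI-PMH ListRecords reply.
SAMPLE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:dataset-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Excavation report (sample)</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>
"""

print(list_records_url("https://example.org/oai"))
records, token = parse_records(SAMPLE)
print(records, token)
print(list_records_url("https://example.org/oai", resumption_token=token))
```

A harvester would repeat the last two steps, requesting the next page with each returned token, until a response arrives with an empty or missing `resumptionToken`.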
  • 31. General contact: Info@DANS.KNAW.NL. Head Data Archive: Hella.Hollander@DANS.KNAW.NL. Senior Data Steward / Preservation Officer: Valentijn.Gilissen@DANS.KNAW.NL. Watch our videos on YouTube: https://www.youtube.com/user/DANSDataArchiving. Thanks for listening!