A DataTags approach to evaluating whether researchers' datasets uploaded to digital repositories comply with the new European General Data Protection Regulation (GDPR). The presentation provides the prototype of a decision tree and of a questionnaire.
2. Personal Data and the DANS archive
• Researcher uploading data is primarily responsible
• DANS can only check marginally
• Tool needed to support decisions on required data
protection – compliant with GDPR and national legislation
• Use Harvard’s DataTags as starting point
http://datatags.org/
3. Privacy: GDPR and DataTags
• General Data Protection Regulation (EU), adopted 14 April 2016
• Applies as European law from 25 May 2018 onward:
– Data minimisation required
– Informed consent important
– Data Protection Officer mandatory
– Right to be informed (e.g. of data breaches)
– High fines for infringements (e.g. data breaches)
• Implications for sharing data on human subjects?
– Researchers don’t know
– Data repositories don’t know
• Data tagging approach, initially developed at Harvard
4. Background
Sweeney & Crosas introduced the notion of a DataTags repository
• Stores and shares data files in accordance with different security
levels, access requirements and usage agreements
• Based on American laws and regulations on personal data
5. Step by step 1 and 2
1. Identify the relevant articles of GDPR for research and archive
purposes
a) Example: Article 9(2) sets out the circumstances in which the processing of
sensitive personal data (which is otherwise prohibited) may take place:
• Necessary for archiving purposes in the public interest, or scientific and
historical research purposes or statistical purposes in accordance with Article
89(1).
b) Example: Article 17, the right to erasure ("right to be forgotten")
2. Transformation of relevant articles into questions
a) Were the data processed for archiving in the public interest, scientific or historical
research purposes or statistical purposes?
b) Would you consider the dataset to contain sensitive personal information? [Article 9]
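Step 2 can be pictured as a small data structure that keeps each question tied to the article it was derived from. This is only a sketch: the `Question` class, its field names and the `questions_for` helper are assumptions made for illustration and are not part of the DANS prototype. The question texts are the two examples given above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Question:
    """A yes/no question derived from a GDPR article (hypothetical structure)."""
    key: str      # short identifier, invented for this sketch
    article: str  # GDPR article the question was derived from
    text: str     # wording shown to the depositor


QUESTIONS = [
    Question(
        key="research_purpose",
        article="Article 9(2)",
        text=("Were the data processed for archiving in the public interest, "
              "scientific or historical research purposes or statistical "
              "purposes?"),
    ),
    Question(
        key="sensitive",
        article="Article 9",
        text=("Would you consider the dataset to contain sensitive personal "
              "information?"),
    ),
]


def questions_for(article_fragment: str) -> List[Question]:
    """Return every question whose source reference mentions the fragment."""
    return [q for q in QUESTIONS if article_fragment in q.article]
```

Keeping the article reference next to each question makes it easier to revisit a question when the legal interpretation or national legislation changes.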
6. Step by step 3
3. Decision tree evolution
– Creating routes for questions, ending with tags
– Deciding on tag options and recommendations following each route
– Tree diagram and feedback
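The idea of routes that end in tags can be sketched as a dictionary of question nodes that is walked with yes/no answers. The tree below is a toy example: its two questions and their routing are invented for illustration and are not the DANS decision tree. The tag names follow the access levels used in DANS-EASY.

```python
# Hypothetical toy tree: the routing is invented for illustration and is
# NOT the actual DANS decision tree. Leaves are marked with a "TAG:" prefix.
TREE = {
    "start": {
        "question": "Does the dataset contain personal data?",
        "yes": "sensitive",
        "no": "TAG:0 Public access",
    },
    "sensitive": {
        "question": ("Would you consider the dataset to contain sensitive "
                     "personal information?"),
        "yes": "TAG:2 Restricted access",
        "no": "TAG:1 Basic access",
    },
}


def walk(tree, answers, node="start"):
    """Follow yes/no answers through the tree.

    Returns the tag reached and the route taken, so that a recommendation
    can be explained to the depositor.
    """
    route = []
    while not node.startswith("TAG:"):
        answer = answers[node]        # "yes" or "no" for this question
        route.append((node, answer))
        node = tree[node][answer]     # move to the next question or a leaf
    return node[len("TAG:"):], route


tag, route = walk(TREE, {"start": "yes", "sensitive": "no"})
print(tag)    # 1 Basic access
print(route)  # [('start', 'yes'), ('sensitive', 'no')]
```

Returning the route together with the tag is a design choice for this sketch: it keeps the recommendation traceable to the answers that produced it.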
7. Step by step 4
4. Linking Tags to protection levels and policies
– Currently 4 levels in DANS-EASY
– No personal data in DataverseNL?
| Tag type | Authentication | When transmitted | When stored | Reading/downloading rights |
|---|---|---|---|---|
| 0. Public access (non-personal data) | None needed | Without encryption | Standard - clear storage | Everyone (with or without registration) |
| 1. Basic access (non-confidential personal data) | Registration necessary | With encryption | Encrypted storage | All registered users |
| 2. Restricted access (sensitive personal data) | Registration via repository and approval of depositor | With encryption | Encrypted storage | All registered users, after approval of depositor and/or archive |
| 3. Selected access (highly sensitive data) | Registration via repository and mandatory further identification | Double encryption, secure access system | NOT accessible via the internet and with double encryption | Authorized (checked) users only |
Adapted from Emily Thomas
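The link between tags and protection levels can be expressed as a simple lookup. In this sketch the policy texts are copied from the table above, while the names `Tag`, `Policy`, `POLICIES` and `requires_registration` are assumptions introduced for illustration and are not taken from DANS-EASY or the DataTags software.

```python
from enum import IntEnum
from typing import NamedTuple


class Tag(IntEnum):
    """The four tag levels from the table above (names are hypothetical)."""
    PUBLIC = 0      # non-personal data
    BASIC = 1       # non-confidential personal data
    RESTRICTED = 2  # sensitive personal data
    SELECTED = 3    # highly sensitive data


class Policy(NamedTuple):
    authentication: str
    when_transmitted: str
    when_stored: str
    readers: str


POLICIES = {
    Tag.PUBLIC: Policy(
        "None needed", "Without encryption", "Standard - clear storage",
        "Everyone (with or without registration)"),
    Tag.BASIC: Policy(
        "Registration necessary", "With encryption", "Encrypted storage",
        "All registered users"),
    Tag.RESTRICTED: Policy(
        "Registration via repository and approval of depositor",
        "With encryption", "Encrypted storage",
        "All registered users, after approval of depositor and/or archive"),
    Tag.SELECTED: Policy(
        "Registration via repository and mandatory further identification",
        "Double encryption, secure access system",
        "NOT accessible via the internet and with double encryption",
        "Authorized (checked) users only"),
}


def requires_registration(tag: Tag) -> bool:
    """Per the table, every level above public access needs registration."""
    return tag > Tag.PUBLIC


print(POLICIES[Tag.RESTRICTED].readers)
```

Using an ordered enumeration lets stricter levels be compared directly, which may be convenient if further tags are added later.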
8. Step by step 5: the questionnaire
Background info & legal explanation
9. Step by step 6, 7, 8
6. Internal testing - feedback
– Results: controller & processor, national laws, roles of
producer & depositor
– Researcher vs. archivist roles - different feedback?
7. Policy Models framework by Mor Vilozni (Harvard)
8. EUDAT Workshop, DANS presentation - feedback!
10. Next steps/possible adjustments
● Separate researcher and archivist trees?
● Addition of any other tags required?
● Which policies connected to which tags?
● Add in national legislation when formulated
● Implement in data submission procedure?
– As a separate micro-service?
– As a recommendation system or a fully automated tagging system?
● Implementation: Harvard Policy Models Framework vs.
Zingtree?
11. Web references
• Questionnaire derived from the decision tree,
including descriptions and links in Zingtree:
https://goo.gl/cBgcmJ
• Graph of decision tree as PDF:
https://goo.gl/fveAZ7
• GDPR DataTags prototype in Harvard’s Policy
Models Framework:
http://dvnweb-vm1.hmdc.harvard.edu/models/GDPR/3/intro