www.atmire.com
Bram Luyten
Tom Desair
Open Repositories 2017
Archiving Sensitive Data
Unfortunately, not all repository content is equally open
Overview
Metadata-based access control
Strategies for dealing with sensitive data
Actionable takeaways
Question
How many authorization groups are there in your DSpace?
Metadata-based access control
Uses characteristics of the EPerson and of the Item to determine whether the EPerson is entitled to access the item.
Example:
Access is granted on an exact match between a value stored on the EPerson, such as a social security number or an email address, and the same value in the metadata of the item.
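A minimal sketch of the exact-match idea, written in Python rather than the Java that DSpace itself uses. The function name `is_entitled` and the dictionary representation of EPerson and item metadata are assumptions made for illustration only; they are not the API of DSpace or of the module presented here. The field names are taken from the configuration example in this deck, and the identifier values are made up.

```python
def is_entitled(eperson_metadata, item_metadata, eperson_field, item_field):
    """Return True when a value on the EPerson exactly matches a value on the item.

    Both metadata arguments map a field name to a list of values
    (a hypothetical representation chosen for this sketch).
    """
    # Ignore empty values so that two blank fields never count as a match.
    eperson_values = {v for v in eperson_metadata.get(eperson_field, []) if v}
    item_values = {v for v in item_metadata.get(item_field, []) if v}
    # Exact match: no trimming, case folding or partial matching is applied.
    return not eperson_values.isdisjoint(item_values)


# Made-up example records.
eperson = {"eperson.acl.socialsecurity": ["ID-0001"]}
matching_item = {"dc.contributor.socialsecurity": ["ID-0001"]}
other_item = {"dc.contributor.socialsecurity": ["ID-0002"]}

print(is_entitled(eperson, matching_item,
                  "eperson.acl.socialsecurity", "dc.contributor.socialsecurity"))
print(is_entitled(eperson, other_item,
                  "eperson.acl.socialsecurity", "dc.contributor.socialsecurity"))
```

Because the comparison in this sketch is strictly exact, any normalization of identifiers would have to happen before the values are stored on the EPerson or the item.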
Advantages
Scale
No identified limits on the number of EPeople, items or groups
Performance
No identified limits on search or item access volumes
Can be managed outside of DSpace
Both EPerson and Item metadata can be sourced externally
Configurable
Configuration example
<group-policy groupName="Autenticated_eID_Users">
  <exact-match-policy>
    <itemField>dc.contributor.socialsecurity</itemField>
    <epersonField>eperson.acl.socialsecurity</epersonField>
    <epersonValueExtractor></epersonValueExtractor>
  </exact-match-policy>
</group-policy>
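To make the structure of this fragment explicit, the following Python sketch reads it with the standard-library `xml.etree.ElementTree` parser and lists which item field is paired with which EPerson field. The function name `read_exact_match_policies` and the dictionary keys are hypothetical. How the actual module interprets `groupName` or the empty `epersonValueExtractor` element is not stated in these slides, so the sketch only extracts the values and assigns them no behavior.

```python
import xml.etree.ElementTree as ET

# The configuration fragment from the slide, reproduced verbatim.
CONFIG = """
<group-policy groupName="Autenticated_eID_Users">
  <exact-match-policy>
    <itemField>dc.contributor.socialsecurity</itemField>
    <epersonField>eperson.acl.socialsecurity</epersonField>
    <epersonValueExtractor></epersonValueExtractor>
  </exact-match-policy>
</group-policy>
"""


def read_exact_match_policies(xml_text):
    """Return one dict per <exact-match-policy> found in the fragment."""
    root = ET.fromstring(xml_text)
    policies = []
    for node in root.iter("exact-match-policy"):
        policies.append({
            "group": root.get("groupName"),
            "item_field": node.findtext("itemField"),
            "eperson_field": node.findtext("epersonField"),
            # An empty element yields "", which is reported here as None.
            "value_extractor": node.findtext("epersonValueExtractor") or None,
        })
    return policies


for policy in read_exact_match_policies(CONFIG):
    print(policy)
```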
Disadvantages
Edit metadata = Edit authorizations
Be very careful about who or what has rights to edit metadata
Your metadata becomes even more sensitive
The impact of unauthorized access to item metadata may become more severe
Dealing with sensitive data
Strategies
Severity is driven by
probability and impact
Example 1: Unauthorized access
Impact
High if you're dealing with sensitive data
Low if you're dealing with public/non-sensitive data
Probability
Lower the harder it is for people to access your system
Higher the longer you wait to apply security updates
Example 2: Losing all your data
Impact
High if you're dealing with data that only exists in one place
Low(er) if data exists in multiple places
Probability
What does "losing" mean?
What does "all" mean?
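One way to turn the two examples into a comparable ranking is a simple risk matrix. The sketch below is an assumption, not a method given in the slides: it scores probability and impact on a three-step scale and multiplies them, which is one common convention. The `Risk` class, the `SCALE` table and the ratings assigned to each scenario are illustrative only.

```python
from dataclasses import dataclass

# Hypothetical three-step ordinal scale.
SCALE = {"low": 1, "medium": 2, "high": 3}


@dataclass(frozen=True)
class Risk:
    name: str
    probability: str
    impact: str

    @property
    def severity(self) -> int:
        # Assumed convention: severity = probability score x impact score.
        return SCALE[self.probability] * SCALE[self.impact]


def rank(risks):
    """Return the risks ordered from most to least severe."""
    return sorted(risks, key=lambda r: r.severity, reverse=True)


# Illustrative ratings; a real assessment would supply its own.
risks = [
    Risk("Losing all data, copies kept in multiple places", "low", "medium"),
    Risk("Unauthorized access, sensitive data, delayed security updates", "high", "high"),
    Risk("Unauthorized access, public data only", "medium", "low"),
]

for risk in rank(risks):
    print(risk.severity, risk.name)
```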
Actionable takeaways
Code available at https://github.com/milieuinfo/dspace54-atmire/
Feel free to (re)use what you want
Assess the severity of your risks by thinking about the associated probability and impact.

Credits
Images
Keys https://www.flickr.com/photos/bohman/
Tsunami https://www.flickr.com/photos/groovyanddreamy/
Pick it https://www.flickr.com/photos/shanepope/
