Automating rights decision elag 2017

Automating rights decisions
ELAG 2017, 08-06-2017
Jeffrey van der Hoeven, Rene Wiermer
info@kb.nl

The dream: Open access to everything for everybody!
In reality: Limited access due to copyright & contracts

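Examples of restrictions (1)
[Figure: timelines of access by publication date. Digitized newspapers: axis 1600, 1930, 1945, 1980, 2017, with open and closed periods. Digitized books: axis 1400, 1900, 1995, 2017, with open and restricted periods and a "no download" label.]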

[Figure: access conditions per content type. Datasets: open, API key, or account. Scientific articles, by journal title and publisher (Publisher A, Publisher B, ... Publisher Z), with the label "Reading room only".]

Copyright infringement on photographs
Newspaper X, Newspaper Y

What can I do with this publication about quantum physics?

User interaction. Here: accepting terms of use

Needs 1: more information to the end user
- How do I get access?
- What can I do with it?
Improve UX with standardization of rights decisions

Needs 2: One system for multiple applications
- Several websites: Delpher, Geheugen van Nederland, Staten-Generaal Digitaal
- Several APIs: URN-Resolver, OAI-PMH, Search services …
Centralize access decisions for better compliance, management and reporting
One change = immediately visible in each application

Needs 3: reducing our digitization backlog
- We have a lot of digital content that requires certain restrictions
- How can we make this accessible to anybody who is allowed to see it?
- We had an “on/off” infrastructure for most of our content
  - Either accessible for everybody or not at all
  - Not flexible enough, blocked workflows
Automation of rights decisions based on
- Metadata (publication date, authors, publisher, type of material, ...)
- Location (e.g. reading room)
- Type of user (e.g. researcher)

Simple approach: extra metadata field ?
- For example:
  - <rights> FREE|RESTRICTED|CLOSED|... </rights>
  - <license> CC0|CustomContract|... </license>
- Make the decision based on the value of that field
- Probably works fine in a lot of scenarios
- But:
  - Does not scale with variation depending on context, e.g. “Free for users of type researcher and visitors to the reading room, but not outside of it”
  - Needs maintenance over time
  - Missing: why was this decision made?

Instead: policies as code
- Policy: formalized set of rules regarding a collection of objects
- Decided at runtime -> decisions can change over time
- Follows general lines of thought of the organization: legal obligations, contracts with publishers, management decisions
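To make the contrast with a static metadata field concrete, here is a minimal sketch of the quoted rule ("free for users of type researcher and visitors to the reading room, but not outside of it") written as a policy that is evaluated per request. It is written in Java rather than the Groovy used in production, and every name in it (RequestContext, ResearcherOrReadingRoomPolicy, the "RESEARCHER" role, the "INTERNET" location) is a hypothetical stand-in, not the KB's actual code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical request context; field names are assumptions for illustration.
final class RequestContext {
    final String location;   // e.g. "READING_ROOM" or "INTERNET"
    final Set<String> roles; // e.g. "RESEARCHER"

    RequestContext(String location, String... roles) {
        this.location = location;
        this.roles = new HashSet<String>(Arrays.asList(roles));
    }
}

// Hypothetical policy: the outcome depends on who asks and from where,
// and the returned text records why the decision was reached.
final class ResearcherOrReadingRoomPolicy {
    static String decide(RequestContext context) {
        if ("READING_ROOM".equals(context.location)) {
            return "PERMIT: request comes from the reading room";
        }
        if (context.roles.contains("RESEARCHER")) {
            return "PERMIT: user has the researcher role";
        }
        return "DENIED: neither a researcher nor in the reading room";
    }
}
```

The same object yields different outcomes for different requests, which a single stored FREE/RESTRICTED/CLOSED value cannot express, and the reason travels with the decision.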

Example: Simplest policy
Everything is freely accessible
return Decision.permit();

Still simple policy
Role-based access (from API key, username/password auth, …):
if (context.roles.contains("DS_METADATA_DTS")) {
...
}
Access based on publication date:
static GregorianCalendar metadataFreeDate = new GregorianCalendar(1940, Calendar.JANUARY, 1);
if (attributes.getMetadata().getPublicationDate()?.before(metadataFreeDate.getTime())) {
...
}
Fallback:
return Decision.denied();

Example: Books
Check for location:
if (context.location.equals("READING_ROOM")) {
...
}
Demand measures to prevent downloads from the frontend:
if (attributes.listContainsValue("boeken-leeszaal-kopieerbeveiliging", "ppn", attributes.getMetadata().getPpn())) {
return Decision.permit(new Obligation("DoNotDownload"), usageRights);
}
Check for death dates of all contributors:
if (DateChecks.allAuthorsDeadLongerThan(attributes.getMetadata(), authorDeathDateLimit)) {
return Decision.permit(usageRights);
}
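The slides do not show how DateChecks.allAuthorsDeadLongerThan works. As a rough illustration only, a check of this kind might look like the Java sketch below. The class name DateChecksSketch, the simplified signature (a list of death dates instead of the metadata object), and the conservative handling of missing data are all assumptions, not the KB's implementation.

```java
import java.time.LocalDate;
import java.util.List;

// Hypothetical stand-in for the DateChecks helper referenced on the slide.
final class DateChecksSketch {
    // True only if every contributor has a known death date that lies
    // at least `years` years before `today`.
    // A null entry stands for "unknown or still living" and fails the check.
    static boolean allAuthorsDeadLongerThan(List<LocalDate> deathDates, int years, LocalDate today) {
        if (deathDates.isEmpty()) {
            return false; // no contributor data: do not assume the work is free
        }
        LocalDate cutoff = today.minusYears(years);
        for (LocalDate death : deathDates) {
            if (death == null || death.isAfter(cutoff)) {
                return false;
            }
        }
        return true;
    }
}
```

Treating missing or empty contributor data as a failed check is a deliberate choice in this sketch: when the metadata is incomplete, the policy falls through to a stricter rule instead of permitting access.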

Decisions
Input: Identifier, Metadata, Location, Authorization
End result of a policy decision:
- PERMIT
- DENIED
- NOT APPLICABLE
additional attributes:
- obligations: things the endpoint has to enforce
- advices: things the endpoint might need to improve UX
Example: PERMIT (obligation: "DoNotDownload", advice: "OnlyInReadingRoom")
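One way to picture such a decision result is as a small value object. The Java sketch below is an assumption-laden illustration: the class name DecisionSketch, its fields, and the allowsDownload helper are hypothetical and do not reproduce the Decision class used in the policy scripts.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical model of a policy decision with its additional attributes.
final class DecisionSketch {
    enum Effect { PERMIT, DENIED, NOT_APPLICABLE }

    final Effect effect;
    final List<String> obligations; // things the endpoint has to enforce
    final List<String> advices;     // hints the endpoint may use to improve UX

    private DecisionSketch(Effect effect, List<String> obligations, List<String> advices) {
        this.effect = effect;
        this.obligations = Collections.unmodifiableList(obligations);
        this.advices = Collections.unmodifiableList(advices);
    }

    static DecisionSketch permit(List<String> obligations, List<String> advices) {
        return new DecisionSketch(Effect.PERMIT, obligations, advices);
    }

    static DecisionSketch denied() {
        return new DecisionSketch(Effect.DENIED,
                Collections.<String>emptyList(), Collections.<String>emptyList());
    }

    static DecisionSketch notApplicable() {
        return new DecisionSketch(Effect.NOT_APPLICABLE,
                Collections.<String>emptyList(), Collections.<String>emptyList());
    }

    // Endpoint-side check: downloads are offered only when access is
    // permitted and no "DoNotDownload" obligation is attached.
    boolean allowsDownload() {
        return effect == Effect.PERMIT && !obligations.contains("DoNotDownload");
    }
}
```

In this sketch an obligation changes what the endpoint may do, while an advice is informational only, which mirrors the distinction made on the slide.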

[Diagram: Enforce, Decide, Administer, with Metadata and Context as inputs. Diagram by David Brossard under a CC-BY 3.0 license.]

[Diagram: the same Enforce / Decide / Administer / Metadata / Context layout, annotated with our components: image server, OAI-PMH, object store, PDP webservice, RDBMS (metadata), HTTP request, admin/reporting GUI, policy scripts (Groovy), authorization (LDAP).]

Architecture: XACML (sort of)
- Attribute Based Access Control (ABAC)
- Follows XACML reference architecture
- … but not the language (cumbersome, slow and restricted)

Technology
- Write the policies in an embedded scripting language (Groovy)
  - Fast (in comparison to XACML language implementations)
  - Can be adopted and managed outside the core development team
  - Still reuses the existing development toolchain
  - Automated testing!
- Deployed as central REST service
  - Serves multiple applications
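Because policies are ordinary code, they can be covered by ordinary automated tests. The Java sketch below shows the idea with a table of cases for the publication-date rule from the "Still simple policy" slide. The class names and the boolean return type are simplifications assumed for illustration; the production policies are Groovy scripts returning Decision objects.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;

// Hypothetical, simplified version of the publication-date rule.
final class PublicationDatePolicy {
    // Cut-off taken from the slide: 1 January 1940.
    static final Date METADATA_FREE_DATE =
            new GregorianCalendar(1940, Calendar.JANUARY, 1).getTime();

    // Permit only when a publication date is known and lies before the cut-off.
    static boolean permits(Date publicationDate) {
        return publicationDate != null && publicationDate.before(METADATA_FREE_DATE);
    }
}

// Table-driven check: each input is paired with the expected outcome.
final class PublicationDatePolicyCheck {
    // Runs all cases and returns the number of failures.
    static int run() {
        Date[] inputs = {
            new GregorianCalendar(1939, Calendar.DECEMBER, 31).getTime(), // just before cut-off
            new GregorianCalendar(1940, Calendar.JANUARY, 1).getTime(),   // exactly on cut-off
            null                                                          // date unknown
        };
        boolean[] expected = { true, false, false };
        int failures = 0;
        for (int i = 0; i < inputs.length; i++) {
            if (PublicationDatePolicy.permits(inputs[i]) != expected[i]) {
                failures++;
            }
        }
        return failures;
    }
}
```

Boundary cases such as "exactly on the cut-off" and "date unknown" are the ones most worth pinning down in tests, since they are where a policy change is most likely to shift behaviour unnoticed.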

Reporting and testing
[Figure: reporting and testing views covering collections, policies, digital objects, and metadata.]

Limitations
- Search filtering on access: combination with dynamic decisions
- Which objects am I allowed to use ?
- Export of access information to other systems (e.g. WorldCat)
Possible mitigations
- Compromises on dynamic decisions (short term)
- Move from slow ETL to event-based architectures (longer term)

Current status & results
- In production, rolled out step by step, since mid-2016
- New objects are becoming available
- Copyright claims are easier to handle
- Clearer insight into current status of collection
- Better insight into needs for partnership contracts
- Impulses for better metadata storage/access infrastructure
175M requests per month (+/- 6 million a day)
60+ million pages under control by access management

About
- Managing digital collections with multiple licenses and access policies
- Technical choices that fit our organisational needs
Not about
- DRM and copy protection
- Usage of closed proprietary systems

Motivation
- As a public service organisation we want: access as far as possible
- Limits to what is possible:
  - Licenses
  - Contractual obligations
  - Governmental and organisational policies
  - Copyright status
- A simple yes or no is not always enough; we need
  - a clear guideline for the user: what can I do with it and how do I get access?
  - automation of management: we want to be able to scale and still be compliant

Crossing the domains: communication
- Define your terms: collection, policy, decision … make sure to communicate them clearly
- Make sure contracts and managerial decisions can be translated to the technical reality
- Offer protection and guarantee options for future contracts
- Make compliance easier through monitoring + reporting
- Use of examples + flow diagrams

ONIX-PL: machine-readable contracts
Machine-readable, but not actionable

Our problems
- Multiple applications give access to collections
- ideally centralised decision making and reporting
- Decisions depend on context: user, location, time
- Flexible to allow for individual interventions
- Clearer insight is needed into why things are hidden away
