2. Registry
Outline
• reference data and role of the registry
• environment registry service
– demonstration
– information model
– services
– status
• processes and usage
– data flows – preparation and access
– governance
– implications
• Q & A
3. Registry
Goals of the webinar
• understand the registry service
– what it is
– what it’s for
• stimulate thinking on
– how to make use of it
– processes and governance
• feedback
– future training needs (technical details course)
– service requirements
4. Registry
Motivation
Multiple forces
– deliver government policy
• ICT Strategy
• Digital Policy
• Open Data
• Open Standards
– INSPIRE
– Defra move to shared services (ICT Futures,
Knowledge Strategy and Open Data Strategy)
"Government IT must be open - open to the people
and organisations that use our services and open
to any provider, regardless of their size."
6. Registry
Reference data
Coded record: EG13987 P816L H 0.x u35 2012
Meaning: Grafham water phosphate level High 0.x mg/l 2012
• standardized terms to identify things in data
• normally represented by coded identifiers
• key to meaning of the data
– entities, places, objects
– substances, determinands
– units
– classifications, codes
– assessment methodology
– sampling methodology
7. Registry
Reference data identifiers
• to enable data reuse and access we need
reference data that is:
– independent of a particular data system
• global identifiers
– stable
• persistent identifiers
– interpretable
• identifiers you can look up (“resolvable”)
8. Registry
Reference data identifiers
Open standards process
• persistent resolvable
identifiers challenge
• adopt HTTP 1.1 and URLs for this
– board resolution 24 Sep 2013
• obvious fit
– global (DNS)
– resolvable (http)
9. Registry
Reference data identifiers
• => requirement to create and maintain URIs
for identifiers in reference data
• challenges
– how to share an authoritative namespace?
– what do they dereference to?
– managing authoritative lists of terms?
10. Registry
UKGovLD Registry
Registry – tool to manage URIs for reference data
Services:
– manage controlled lists of identifiers as URIs
– store core data so the identifiers resolve
– namespace management
– other: validation, discovery, version and history
Design and open source implementation
Defra instance at:
http://environment.data.gov.uk/registry/
19. Registry
Machine accessible
• each collection of entries or individual entries
has a URI
• accessed through a browser, you see a web page
• a machine can request specific format like
JSON directly (content negotiation)
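A minimal sketch of content negotiation using only Python's standard library. It assumes the service honours a standard HTTP Accept header; the helper name build_request is illustrative, and the request is only constructed here, not sent.

```python
from urllib.request import Request

REGISTRY_ROOT = "http://environment.data.gov.uk/registry/"

def build_request(uri: str, media_type: str = "application/json") -> Request:
    """Build (but do not send) a GET request asking for a specific format."""
    # The Accept header is how a client states which representation it wants.
    return Request(uri, headers={"Accept": media_type})

req = build_request(REGISTRY_ROOT)
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would send it; a browser visiting the same URI would normally receive the web page instead.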
22. Registry
Information model
• Registers can contain registers or simple items
– root register
http://environment.data.gov.uk/registry/
– top level of definitions
http://environment.data.gov.uk/registry/def/
– a themed collection
http://environment.data.gov.uk/registry/def/catchment-planning/
– a code list
http://environment.data.gov.uk/registry/def/catchment-planning/RiverBasinDistrict/
– codes
• http://environment.data.gov.uk/registry/def/catchment-planning/RiverBasinDistrict/UK01
• http://environment.data.gov.uk/registry/def/catchment-planning/RiverBasinDistrict/UK02
• ...
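The nesting can be read straight off the URIs. The sketch below assumes, as the example URIs suggest, that each path segment under the root names a register nested in the one before it; the function name containing_registers is illustrative.

```python
ROOT = "http://environment.data.gov.uk/registry/"

def containing_registers(uri: str, root: str = ROOT) -> list[str]:
    """List the registers that contain `uri`, from the root downwards."""
    if not uri.startswith(root):
        raise ValueError("URI is outside the registry namespace")
    segments = [s for s in uri[len(root):].split("/") if s]
    registers = [root]
    # Every path segment except the last names a nested register.
    for segment in segments[:-1]:
        registers.append(registers[-1] + segment + "/")
    return registers

code = ROOT + "def/catchment-planning/RiverBasinDistrict/UK01"
for register in containing_registers(code):
    print(register)
```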
23. Registry
Information model
Register - a controlled list of items
– manager and governance policy
– lifecycle and status for items in the register
24. Registry
Information model
Register item
– a definition of something
• concept, organization, geographic area, substance ...
– represented as a set of property values
– type and label required
– values can be simple (strings, numbers) or URIs
http://environment.data.gov.uk/registry/def/catchment-planning/RiverBasinDistrict/UK05
type http://location.data.gov.uk/def/am/wfd/RiverBasinDistrict
label Anglian
notation UK05
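As a rough illustration, the item above can be held as a plain set of property values. The dictionary layout and the helper missing_required are assumptions made for this sketch; only the rule that type and label are required comes from the slide.

```python
REQUIRED = ("type", "label")

item = {
    "uri": "http://environment.data.gov.uk/registry/def/catchment-planning/RiverBasinDistrict/UK05",
    "type": "http://location.data.gov.uk/def/am/wfd/RiverBasinDistrict",  # a URI value
    "label": "Anglian",                                                   # a simple string value
    "notation": "UK05",
}

def missing_required(properties: dict) -> list[str]:
    """Return the required property names that are absent or empty."""
    return [name for name in REQUIRED if not properties.get(name)]

print(missing_required(item))               # nothing missing for the example item
print(missing_required({"notation": "X"}))  # type and label are both missing
```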
25. Registry
Information model
• Register item
– no constraints on what properties you use to
describe the item (“schema-less”)
– open ended, can add richer descriptions later
– gives a lot of flexibility for what you can register
– tame this by adopting a few standard patterns
• but can add more as needs change
26. Registry
Information model
Standard patterns
• collections of codes
– SKOS (Simple Knowledge Organization System)
– Concepts grouped into ConceptSchemes
• organizations
– ORG (Organization Ontology)
27. Registry
Aside on RDF and linked data
• there’s a standard for how to represent such
descriptions of things identified by URI
– Resource Description Framework (RDF)
• the registry design and implementation is built
on this standards stack
• the standard patterns are RDF vocabularies
• but don’t need to use RDF in your information
systems in order to use the registry
32. Registry
Information model
Linking
– value of a property can be a URI
• any URI – same register, other register, external
– use for hierarchical structure within a register
• concept schemes with broader/narrower links
• organizational structure with sub-organizational links
– use for cross matching between code lists
• exact match within MMO experimental data
– use to relate registered URIs with other URIs
• "same as" links from organizations to organogram IDs
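A small sketch of how URI-valued properties give a register its hierarchy. The concept URIs and labels are hypothetical, and the property name "broader" simply mirrors the broader/narrower links mentioned above.

```python
# Hypothetical concept URIs, each with an optional "broader" link to another URI.
BASE = "http://example.org/def/habitat/"
concepts = {
    BASE + "wetland": {"label": "Wetland"},
    BASE + "fen": {"label": "Fen", "broader": BASE + "wetland"},
    BASE + "reedbed": {"label": "Reedbed", "broader": BASE + "fen"},
}

def broader_chain(uri: str) -> list[str]:
    """Follow broader links upwards and return the labels passed on the way."""
    labels, seen = [], set()
    while uri in concepts and uri not in seen:
        seen.add(uri)  # guard against accidental cycles
        labels.append(concepts[uri]["label"])
        uri = concepts[uri].get("broader")
    return labels

print(broader_chain(BASE + "reedbed"))
```

A link to a URI outside the dictionary (another register or an external site) simply ends the walk, which matches the idea that any URI is allowed as a value.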
33. Registry
Information model
External items
– can register URIs outside the registry service
– register holds
• definitive list of items
• metadata about the registered items (status etc)
• copy of minimal description information
– up to the URI owner to maintain the URI
Managed items
– URI is within the register namespace
– maintenance done within the registry service
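One way to tell the two kinds of item apart is to compare the item's URI with the register's own URI. This is a sketch of that check only, with an illustrative function name, and is not how the registry itself decides.

```python
REGISTER = "http://environment.data.gov.uk/registry/def/catchment-planning/RiverBasinDistrict/"

def is_managed(entity_uri: str, register_uri: str = REGISTER) -> bool:
    """True when the URI sits directly inside the register's own namespace."""
    if not entity_uri.startswith(register_uri):
        return False  # external item: the URI owner maintains it
    local_name = entity_uri[len(register_uri):]
    return bool(local_name) and "/" not in local_name

print(is_managed(REGISTER + "UK05"))
print(is_managed("http://agency.gov.uk/RDB/Anglia"))
```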
35. Registry
Information model
Filters
– category
• classification of subject that the register is about (e.g. Water)
– entity
• the type of thing in the register (e.g. Regions and Habitats)
– owner
• the organization which owns and manages the register
Extensible
– this is just metadata associated with the registers
– could extend the category schemes or add others
36. Registry
Information model
• Summary
– Register
• controlled list
• arranged in a hierarchy like folders in a file system
• can be annotated and classified to help with navigation
– Item
• entry in register
• can be external to the registry or internal (managed)
• has a URI and extensible set of descriptive properties
• optional standard patterns for the descriptions
• properties enable links between items
38. Registry
Registry services
• Outline of services
– manage controlled lists of identifiers as URIs
– serve data for the registers and managed items
– namespace management
– validation
– discovery
– version and history management
39. Registry
Registry services
An API for everything
– REST API for each of the services
– user interface is layered on top
– can build external tools which provide other
interfaces but work via the API
– interface itself is template-driven and easy to
modify
40. Registry
Registry services
• Outline of services
– manage controlled lists of identifiers as URIs
• create register
• register item(s)
• update items
• change status
41. Registry
Registry services
• Outline of services
– manage controlled lists of identifiers as URIs
– serve data for the registers and managed items
• return as RDF, JSON, CSV (TBD)
• view as web page in browser
• control over how lists (registers) are returned
– see metadata as well as the items
– filter by status
– page through long lists
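The effect of status filtering and paging can be pictured with an in-memory list. This is a local sketch of the behaviour only: the status values, the page size and the function name are hypothetical, and no registry API parameters are implied.

```python
def select_page(items, status=None, page=0, page_size=2):
    """Filter items by status, then return one page of the result."""
    matching = [i for i in items if status is None or i["status"] == status]
    start = page * page_size
    return matching[start:start + page_size]

# Illustrative records; the status values are made up for the example.
items = [
    {"notation": "A1", "status": "stable"},
    {"notation": "A2", "status": "stable"},
    {"notation": "A3", "status": "experimental"},
    {"notation": "A4", "status": "stable"},
]
print([i["notation"] for i in select_page(items, status="stable", page=1)])
```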
42. Registry
Registry services
• Outline of services
– manage controlled lists of identifiers as URIs
– serve data for the registers and managed items
– namespace management
• requests to parts of URI space can be forwarded to
other services
• some support for federation
43. Registry
Registry services
• Outline of services
– manage controlled lists of identifiers as URIs
– serve data for the registers and managed items
– namespace management
– validation
• test if a set of URIs are valid
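Conceptually, validation is a membership test against the registered URIs. The sketch below runs that test locally against a known set; the function name and the UK99 code are hypothetical, and the real service would perform the check on the server.

```python
RBD = "http://environment.data.gov.uk/registry/def/catchment-planning/RiverBasinDistrict/"

def validate_uris(candidates, registered):
    """Split candidate URIs into those that are registered and those that are not."""
    known = set(registered)
    valid = [u for u in candidates if u in known]
    invalid = [u for u in candidates if u not in known]
    return valid, invalid

registered = [RBD + "UK01", RBD + "UK02"]
valid, invalid = validate_uris([RBD + "UK01", RBD + "UK99"], registered)
print("valid:", valid)
print("invalid:", invalid)
```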
44. Registry
Registry services
• Outline of services
– manage controlled lists of identifiers as URIs
– serve data for the registers and managed items
– namespace management
– validation
– discovery
• supports text search
• user interface navigation support
45. Registry
Registry services
• Outline of services
– manage controlled lists of identifiers as URIs
– serve data for the registers and managed items
– namespace management
– validation
– discovery
– version and history management
• stores history of item versions
• versioned URIs
• see item or register at point in time
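Looking up an item "at a point in time" amounts to finding the latest version that had taken effect by that date. The history below is invented for illustration, and the sketch says nothing about how the registry stores versions.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical history: (date the version took effect, label at that version).
history = [
    (date(2013, 1, 10), "Anglia"),
    (date(2013, 6, 1), "Anglian"),
]

def version_at(when: date):
    """Return (version number, label) in effect on `when`, or None if too early."""
    starts = [start for start, _ in history]
    index = bisect_right(starts, when)  # number of versions started on or before `when`
    if index == 0:
        return None
    return index, history[index - 1][1]

print(version_at(date(2013, 3, 1)))
print(version_at(date(2014, 1, 1)))
```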
47. Registry
Status
• open design and open source implementation
– managed by UKGovLD
• proof of concept deployment
– proved principle, and running stably for 8 months
• pilot deployment for environment.data.gov.uk
– supported for one year
– alpha but robust so far
– security model for update based on OpenID
– intention to continue service if pilot successful
– may require enhancements e.g. server replication
49. Registry
Processes and usage
• how do you use the registry?
• how do you get data into it?
• what data should go into it?
• how will that be managed and governed?
50. Registry
Using the reference data
Manual consultation
– e.g. developer looking up meaning of term in data
– look up the URI in the Registry service via a browser
51. Registry
Using the reference data
Use code list in IT application
– e.g. data entry dialog, export mapping ...
– export code list from the Registry service via web API
or manual download
– export formats: RDF, JSON, [CSV]; support other formats?
– IT application uses the export directly, or maps it to an
application-specific local format
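A sketch of the "map to application specific format" step, turning an exported code list into choices for a data entry dialog. The JSON layout, URIs and labels are all hypothetical; the registry's real export format may differ.

```python
import json

# Hypothetical export of a code list; the registry's real JSON layout may differ.
EXPORT = json.dumps([
    {"uri": "http://example.org/def/status/H", "notation": "H", "label": "High"},
    {"uri": "http://example.org/def/status/G", "notation": "G", "label": "Good"},
])

def to_choices(payload: str) -> list[tuple[str, str]]:
    """Map the export to (stored value, display text) pairs, sorted by text."""
    entries = json.loads(payload)
    choices = [(e["uri"], f"{e['notation']} - {e['label']}") for e in entries]
    return sorted(choices, key=lambda choice: choice[1])

for value, text in to_choices(EXPORT):
    print(value, "|", text)
```

Storing the URI as the value, and showing the notation and label only as display text, keeps the application's data tied to the shared identifier.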
52. Registry
Using the reference data
• Publish data using the references
– use the URIs instead of free text or opaque codes
– can use URIs in CSV or JSON, doesn’t have to be
RDF linked data
53. Registry
Using the reference data
Two datasets using local codes:
Site  Det  Measurement
1     A    0.1
1     B    50
2     A    0.5
2     B    10
Site  Determinand  Value
A     X            100
B     X            10
C     X            20
D     X            100
The same datasets using URIs for the determinand:
Site  Det       Measurement
1     http://…  10
1     http://…  50
2     http://…  50
2     http://…  10
Site  Determinand  Value
A     http://…     100
B     http://…     10
C     http://…     20
D     http://…     100
Combined into one dataset:
Site  Det       Measurement
1     http://…  10
1     http://…  50
2     http://…  50
2     http://…  10
A     http://…  100
B     http://…  10
C     http://…  20
D     http://…  100
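The substitution shown in the tables can be done with a simple lookup when exporting CSV. This is a sketch under assumptions: the mapping and its example.org URIs are hypothetical stand-ins for registered determinand URIs, and the function name publish is illustrative.

```python
import csv
import io

# Hypothetical mapping from local determinand codes to shared URIs.
DETERMINAND_URIS = {
    "A": "http://example.org/def/determinand/A",
    "B": "http://example.org/def/determinand/B",
}

def publish(rows_csv: str) -> str:
    """Rewrite the Det column of a CSV so it holds URIs instead of local codes."""
    reader = csv.DictReader(io.StringIO(rows_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["Det"] = DETERMINAND_URIS[row["Det"]]  # KeyError flags an unmapped code
        writer.writerow(row)
    return out.getvalue()

local = "Site,Det,Measurement\n1,A,0.1\n1,B,50\n"
print(publish(local))
```

Once two datasets share the same URIs in this column they can be combined without a separate code-matching step.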
55. Registry
Publishing reference data
• which data?
– reference data that enables data reuse or sharing
– between organizations or as part of open data
– connective reference data
– judgement required
57. Registry
Processes and usage
• Registry is just a tool
– manage a set of global, persistent identifiers
– to enable data to be reused and integrated
• whether across organizations or as open data
– but up to you
• which reference data should be managed this way
• how the maintenance process should work in practice
• when to map data to the shared identifiers
– it’s there to reduce the cost of such management
• to gain the benefits of reusable data
• not to add additional processes for the sake of it
58. Registry
Next steps
Technical “how to” training planned
Coverage:
• preparing data for registration
• registration and managing entries
• accessing data
Suitable for
• potential registry administrators or publishers
59. Registry
Links
• Design and API details
https://github.com/UKGovLD/ukl-registry-poc/wiki
• Alpha site
http://environment.data.gov.uk/registry
64. Registry
Convenient views
• the full RegisterItem/Register structure is complex
• versioning makes that a lot worse
Figure: the full structure, with two versions of each resource
Resource                 Types
//registry               Register, VersionedThing
//registry:1             Register, Version
//registry:2             Register, Version
//registry/_reg          RegisterItem, VersionedThing
//registry/_reg:1        RegisterItem, Version
//registry/_reg:2        RegisterItem, Version
//registry/reg           Register, VersionedThing
//registry/reg:1         Register, Version
//registry/reg:2         Register, Version
//registry/reg/_foo      RegisterItem, VersionedThing
//registry/reg/_foo:1    RegisterItem, Version
//registry/reg/_foo:2    RegisterItem, Version
//registry/reg/foo       (entity)
Link labels in the figure: dct:versionOf, reg:register, reg:definition
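The resource names in the figure suggest that a specific version is addressed by appending :N to a URI. Assuming that convention, a version suffix can be separated with a small pattern; the function name is illustrative.

```python
import re

# Matches a trailing ":<digits>" suffix. A bare "host:port" URI with no path
# would also match, so this is only suitable for URIs that include a path.
VERSIONED = re.compile(r"^(?P<base>.+):(?P<version>\d+)$")

def split_version(uri: str):
    """Return (base URI, version number); the version is None when absent."""
    match = VERSIONED.match(uri)
    if match is None:
        return uri, None
    return match.group("base"), int(match.group("version"))

print(split_version("//registry/reg/_foo:2"))
print(split_version("//registry/reg/foo"))
```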
66. Registry
Aside on details
• internally a register item is more complex
– metadata about status, when submitted etc
– the description of the thing “entity”
Figure: two RegisterItems in one Register
– each RegisterItem carries label, description, status, submitter,
item class, date submitted etc ...
– "register" links each RegisterItem to the Register
– "definition" links each RegisterItem to an EntityReference
– "entity" links each EntityReference to the entity it describes
67. Registry
Structure – information model
• managed entity
– URL in registry namespace
– registry holds master copy of the entity data
Register http://.../def/catchment-planning/RiverBasinDistrict/
Register Item http://.../def/catchment-planning/RiverBasinDistrict/_UK05
Entity http://.../def/catchment-planning/RiverBasinDistrict/UK05
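The example shows a naming pattern: the register item's URI is the entity's local name prefixed with an underscore, inside the same register. A sketch of that pattern, using a hypothetical register URI and an illustrative function name:

```python
def item_and_entity(register_uri: str, local_name: str) -> tuple[str, str]:
    """Return (register item URI, managed entity URI) for a local name."""
    base = register_uri if register_uri.endswith("/") else register_uri + "/"
    # The item (metadata) takes the underscore form; the entity takes the plain name.
    return base + "_" + local_name, base + local_name

# Hypothetical register used only for illustration.
item_uri, entity_uri = item_and_entity("http://example.org/registry/def/colours", "red")
print(item_uri)
print(entity_uri)
```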
68. Registry
Structure – information model
• referenced entity
– URL external to registry (well, register)
– registry holds minimal copy of data
Register http://.../def/catchment-planning/RiverBasinDistrict/
Register Item http://.../def/catchment-planning/RiverBasinDistrict/_UK05
Entity http://agency.gov.uk/RDB/Anglia
69. Registry
Information model
complicated by:
– item v. entity
– versioning
Figure: reg:RegisterItem links to reg:Register (reg:register) and to
reg:EntityReference (reg:definition), which links to the entity (reg:entity)
70. Registry
Information model
• default linked data view of Register is simplified
• configurable
– alternative membership property or inverse property
– so can make a register look like a skos:Collection,
skos:ConceptScheme, owl:Ontology ...
– also acts as an LDP container
• but can request full view (?_view=withMetadata)
Figure: container view v. full view
– full view: reg:RegisterItem links to reg:Register (reg:register) and to
reg:EntityReference (reg:definition), which links to the entity (reg:entity)
– container view: reg:Register links directly to the entity through an
induced membership relation, by default rdfs:member
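Requesting the full view is a matter of adding the _view query parameter quoted above to the register's URI. A minimal sketch with the standard library; the parameter spelling is taken from the slide, and the function name is illustrative.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def full_view(uri: str) -> str:
    """Add _view=withMetadata to a register URI, keeping any other query terms."""
    parts = urlsplit(uri)
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "_view"]
    query.append(("_view", "withMetadata"))
    return urlunsplit(parts._replace(query=urlencode(query)))

print(full_view("http://environment.data.gov.uk/registry/def/"))
```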
72. Registry
Federation, delegation and namespaces
Case 1: External entities
– identifier published in different namespace
– want to include it in authoritative list
Solution:
– just register as a referenced entity
– already seen this
– authoritative because it’s on the list
– can record properties of the entity, and maintain
history
– no namespace management involved
74. Registry
Federation, delegation and namespaces
Case 2: Namespace allocation
– want someone else to serve part of the registry
namespace
– might be a single item or a complete register subtree
– e.g. allocating namespace in location.data.gov.uk for
serving INSPIRE spatial object identifiers
Solution:
– reg:NamespaceForward
– can be a redirect (30X) or proxy (200)
– no constraints on whether target acts like a Registry
– target ought to serve linked data with URIs in the right
namespace, but not required
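The forwarding idea can be sketched as a prefix lookup that yields either a redirect or a proxied response. Everything specific here is an assumption for illustration: the table, the paths, the targets, the choice of 303 as the redirect code, and the rule that the most specific prefix wins.

```python
# Hypothetical forwarding table: path prefix -> (target base URL, mode).
FORWARDS = {
    "/so/": ("http://spatial.example.org/objects/", "redirect"),
    "/so/archive/": ("http://archive.example.org/", "proxy"),
}

def resolve(path: str):
    """Return (HTTP status, target URL) for a forwarded path, or None."""
    matches = [p for p in FORWARDS if path.startswith(p)]
    if not matches:
        return None  # not forwarded: the registry serves it itself
    prefix = max(matches, key=len)  # assumption: the most specific forward wins
    target, mode = FORWARDS[prefix]
    status = 303 if mode == "redirect" else 200  # 303 is one possible 30X code
    return status, target + path[len(prefix):]

print(resolve("/so/ad/123"))
print(resolve("/so/archive/2012/x"))
print(resolve("/def/"))
```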
76. Registry
Federation, delegation and namespaces
Case 3: Federated register
– want someone else to run part of the registry
infrastructure but act like one big registry
– integrated search, validation etc
Solution:
– reg:FederatedRegister
– can be a redirect (30X) or proxy (200)
– target endpoint must comply with Registry API at
least for search, validation and entity lookup
78. Registry
Federation, delegation and namespaces
Case 4: Delegating a register
– want someone else to serve the list of contents of the
register
– but they only have a triple store, not a full registry
implementation
Solution:
– reg:DelegatedRegister
– specify SPARQL endpoint and triple
pattern to enumerate members
reg:DelegatedRegister
reg:delegationTarget [1]
reg:enumerationSubject [0..1]
reg:enumerationPredicate [0..1]
reg:enumerationObject [0..1]
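To make the triple pattern concrete, here is a sketch that turns the three optional enumeration values into a SPARQL query. How the registry treats unset positions is an assumption of this sketch: the members are taken from the subject position unless a subject is fixed, in which case they come from the object position.

```python
def enumeration_query(subject=None, predicate=None, obj=None) -> str:
    """Build a SELECT query whose ?member variable enumerates the register."""
    if subject is not None and obj is not None:
        raise ValueError("subject and object both fixed: nothing left to enumerate")
    s = "?member" if subject is None else f"<{subject}>"
    p = "?p" if predicate is None else f"<{predicate}>"
    if subject is None:
        o = "?o" if obj is None else f"<{obj}>"
    else:
        o = "?member"
    return f"SELECT DISTINCT ?member WHERE {{ {s} {p} {o} }}"

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RBD_CLASS = "http://location.data.gov.uk/def/am/wfd/RiverBasinDistrict"
print(enumeration_query(predicate=RDF_TYPE, obj=RBD_CLASS))
```

The resulting string would be sent to the SPARQL endpoint named by reg:delegationTarget.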
80. Registry
Security model
• authentication
– OpenID (e.g. Google, Google profile)
• authorization
– permissions
• Register, Update, StatusUpdate, Force, Grant, GrantAdmin
• inherit down the tree
• e.g.: Register,Update:/example/local
– can grant to known user or anyone authenticated
– bundled into roles
• Maintainer – Update, Grant
• Manager – Register, StatusUpdate, Update, Grant
• Authorized – Register, Update, StatusUpdate - for experimental
areas
• Administrator - anything
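A sketch of how grants of the form shown above could be checked, with permissions inheriting down the tree. The role contents and the grant syntax come from the slide; the function names and matching logic are illustrative, and the Administrator role is left out because "anything" needs no lookup.

```python
ROLES = {
    "Maintainer": {"Update", "Grant"},
    "Manager": {"Register", "StatusUpdate", "Update", "Grant"},
    "Authorized": {"Register", "Update", "StatusUpdate"},
}

def parse_grant(text: str):
    """Parse 'Register,Update:/example/local' into (permission set, path)."""
    names, path = text.split(":", 1)
    return set(names.split(",")), path

def allowed(grants, action: str, path: str) -> bool:
    """True if any grant covers `action` at `path` or at one of its ancestors."""
    for permissions, granted_path in grants:
        prefix = granted_path.rstrip("/") + "/"
        if action in permissions and (path == granted_path or path.startswith(prefix)):
            return True
    return False

grants = [parse_grant("Register,Update:/example/local"), (ROLES["Maintainer"], "/def")]
print(allowed(grants, "Register", "/example/local/colours"))
print(allowed(grants, "StatusUpdate", "/example/local"))
```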
Editor's Notes
Recap motivation which leads to the registry. Most people here already know all this but sometimes recapping the basics can reveal assumptions that aren’t as shared or obvious as you thought.
Aside on URL vs. URI?
Does this framing make sense for why we want persistent identifiers, why use URLs for these and that some tooling to help with that might be useful?
Live demo – slides are for backup and people unable to access the live meeting
Talk about the information model because that’ll give some idea of the flexibility and what is and isn’t possible. Not the place to delve into full details. Might not have right balance here :)
Actually this is a simplification, do we want to go into that?
Technically the Item is the metadata and the thing itself is an entity. But don’t need to care about that distinction in normal use so avoiding it here.
There are technical details for how this is possible: the separation of RegisterItem and the Entity. Don’t go into it here.
Authority to publish individual lists rests with business owners in each organization.
A Data Board made up of network representatives can promote entries to network standards.
Each organization can have an SRO able to administer its registers and delegate rights to individual publishers to maintain lists or collections under the authority of the corresponding business owners.