Developed by Paul Walk and colleagues at EDINA, RIOXX is designed to provide a mechanism to help UK institutional repositories comply with the national funders' requirements for reporting on open access outputs from public funding. RIOXX focuses on applying consistency to the metadata fields used to record research funder and project/grant identifiers and is designed to support the consistent tracking of open-access research publications across scholarly systems.
The development of RIOXX has emphasised the importance of implementation, and has therefore involved software developers closely, even borrowing some software development practice such as continuous, automated testing of deployed 'feeds' of RIOXX records. This webinar will briefly examine some of these novel approaches to metadata profile development.
4. “Application profiles consist of data elements drawn from one or more namespace schemas combined together by implementors and optimised for a particular local application.”
Heery & Patel: Application Profiles: Mixing and Matching Metadata Schemas
7. why build RIOXX?
• new policies from RCUK and HEFCE mandate that any journal article funded
by research grants be made publicly accessible in a repository
• these policies require that universities make metadata about such papers
easily discoverable
• the available metadata formats were inadequate
• OAI-DC was not rich enough
• OpenAIRE was better but demanded that project IDs be encoded in a particular
syntax not compatible with project IDs from UK Research Councils
• OpenAIRE syntax:
• info:eu-repo/grantAgreement/Funder/FundingProgram/ProjectID/[Jurisdiction]/[ProjectName]/[ProjectAcronym]
• RCUK syntax:
• OpaqueProjectID/version
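One way to see the mismatch between the two syntaxes is to split an OpenAIRE-style identifier on its slashes, as in the minimal sketch below. The function name, the dictionary keys and the sample values `EC`, `FP7`, `123456` and `SomeProgramme` are illustrative assumptions, not part of either specification; the RCUK-style ID `EP/K023195/1` is the one used later in these slides. The sketch assumes that the problem arises because an RCUK ID itself contains slashes, which is one plausible reading of the incompatibility rather than a statement from the slides.

```python
from typing import Optional

OPENAIRE_PREFIX = "info:eu-repo/grantAgreement/"

# Hypothetical key names for the slash-separated segments shown on the slide.
SEGMENT_NAMES = [
    "funder", "funding_program", "project_id",
    "jurisdiction", "project_name", "project_acronym",
]


def parse_openaire_project(identifier: str) -> Optional[dict]:
    """Split an OpenAIRE-style project identifier into named segments.

    Returns None when the string does not start with the OpenAIRE prefix
    or lacks the three mandatory segments.
    """
    if not identifier.startswith(OPENAIRE_PREFIX):
        return None
    parts = identifier[len(OPENAIRE_PREFIX):].split("/")
    if len(parts) < 3 or not all(parts[:3]):
        return None
    return dict(zip(SEGMENT_NAMES, parts))


# An RCUK-style ID is not an OpenAIRE identifier at all ...
print(parse_openaire_project("EP/K023195/1"))

# ... and naively embedding it loses the ID, because its own slashes
# are read as segment separators.
naive = OPENAIRE_PREFIX + "EPSRC/SomeProgramme/EP/K023195/1"
print(parse_openaire_project(naive)["project_id"])
```

In the second call the project ID comes back as just `EP`, with the rest of the grant reference pushed into the optional segments.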
8. • an application profile using properties from 4 namespaces:
• 11 properties from Dublin Core (dc and dcterms)
• 2 properties from NISO Open Access Metadata and Indicators
• 8 from a new namespace - ‘rioxxterms’
• constraints imposed through several controlled vocabularies
• it has one purpose: to provide a mechanism to help institutional repositories
in the UK comply with the RCUK policy on open access.
• it is not designed to provide general interoperability!!
• born at UKOLN, developed by EDINA and Chygrove Ltd., supported by
Research Councils UK (RCUK) & the Higher Education Funding Council for
England (HEFCE), and funded (initially) by Jisc
• Version 2.0 released in January 2015
9. components of RIOXX
• a metadata ‘application profile’
• technical documentation
• an XSD schema to facilitate metadata validation
• mapping to OpenAIRE 3
• a set of guidelines for systems implementation (with a focus on institutional
repositories)
• implementation monitoring and testing framework
• a supporting website (http://www.rioxx.net)
• (+ independent development of software plugins etc. to support RIOXX
implementation)
10. particular concerns
• how to represent the funder
• how to represent the project/grant
• implementing recommendations from the V4OA process:
• controlled vocabularies for rioxxterms:version, rioxxterms:apc
• use of NISO’s Open Access Metadata and Indicators (license_ref and
free_to_read)
• how to represent the persistent identifier of the item described
• provision of identifier(s) pointing to related dataset(s)
• how to represent the rights of use of the item described
11. some specific properties
• dc:identifier
• dc:relation & rioxxterms:version_of_record
• dcterms:dateAccepted
• rioxxterms:author & rioxxterms:contributor
• rioxxterms:project
• license_ref
12. dc:identifier
• identifies the open access item being described by the RIOXX metadata
record.
• regardless of where it is located
• recommended to identify the resource itself, not a ‘splash page’
• this will not always be possible or desirable
• whatever it identifies, it MUST be an HTTP URI
• Example:
<dc:identifier>
http://oro.open.ac.uk/2/1/LIBARTVICEprints.pdf
</dc:identifier>
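The 'MUST be an HTTP URI' rule is easy to check mechanically. The sketch below is one possible check using Python's standard library; the function name is hypothetical, and treating `https` as acceptable alongside `http` is an assumption rather than something stated on the slide.

```python
from urllib.parse import urlparse


def is_http_uri(value: str) -> bool:
    """Return True if value looks like an absolute HTTP(S) URI."""
    parsed = urlparse(value.strip())
    # Assumption: both http and https schemes count as an 'HTTP URI'.
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# Whitespace around the element text, as in the example above, is ignored.
print(is_http_uri("\n  http://oro.open.ac.uk/2/1/LIBARTVICEprints.pdf\n"))
# A bare identifier with a non-HTTP scheme does not pass.
print(is_http_uri("doi:10.1000/182"))
```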
13. dc:relation & rioxxterms:version_of_record
• rioxxterms:version_of_record
• an HTTP URI which is a persistent identifier for the published version of the
resource
• will often (normally?) be a DOI
• dc:relation
• optional property containing an HTTP URI identifying related resources (e.g.
research data sets, software source code etc.)
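Because the identifier is an HTTP URI, a consumer can often work out the underlying identifier scheme from the URI alone. The sketch below shows this for DOIs; the function name and the list of resolver hosts are assumptions made for illustration, and the DOI `10.1000/182` is only a sample value.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumed set of DOI resolver hosts; extend as needed.
DOI_HOSTS = {"doi.org", "dx.doi.org"}


def doi_from_uri(uri: str) -> Optional[str]:
    """Return the DOI carried by an HTTP URI, or None if it is not a DOI link."""
    parsed = urlparse(uri.strip())
    if parsed.scheme in ("http", "https") and parsed.netloc.lower() in DOI_HOSTS:
        return parsed.path.lstrip("/") or None
    return None


print(doi_from_uri("http://dx.doi.org/10.1000/182"))
print(doi_from_uri("http://oro.open.ac.uk/2/1/LIBARTVICEprints.pdf"))
```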
15. rioxxterms:author & rioxxterms:contributor
• both of these accept an optional ‘ID’ attribute
• this MUST be an HTTP URI
• use of ORCID is strongly recommended
• all authors should be represented as individual rioxxterms:author properties
• the ‘first named author’ can be indicated with another optional attribute called,
er…, ‘first-named-author’
• rioxxterms:contributor is for other parties that are not authors but are credited
with contributing in some way to the publication
• Example:
<rioxxterms:author id="http://orcid.org/0000-0002-1395-3092">
Lawson, Gerald
</rioxxterms:author>
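A consumer of RIOXX records might read these author properties as in the sketch below. The `<record>` wrapper element, the second author, the `"true"` value for `first-named-author` and the function name are all assumptions made for illustration; the namespace URI is the one I understand RIOXX 2.0 to use and should be checked against the published schema.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for rioxxterms (RIOXX 2.0); verify against the XSD.
RIOXXTERMS_NS = "http://www.rioxx.net/schema/v2.0/rioxxterms/"

# Hypothetical record fragment; only the first author comes from the slide.
SAMPLE = """
<record xmlns:rioxxterms="http://www.rioxx.net/schema/v2.0/rioxxterms/">
  <rioxxterms:author id="http://orcid.org/0000-0002-1395-3092"
                     first-named-author="true">
    Lawson, Gerald
  </rioxxterms:author>
  <rioxxterms:author>Another, Author</rioxxterms:author>
</record>
"""


def read_authors(xml_text: str) -> list:
    """Collect name, optional id and first-named flag for each author."""
    root = ET.fromstring(xml_text.strip())
    authors = []
    for el in root.iter("{%s}author" % RIOXXTERMS_NS):
        authors.append({
            "name": (el.text or "").strip(),
            "id": el.get("id"),  # None when the optional attribute is absent
            "first_named": el.get("first-named-author") == "true",
        })
    return authors


for author in read_authors(SAMPLE):
    print(author)
```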
16. rioxxterms:project
• this expresses funder and project_id in one, slightly more complex, property
• the use of global IDs, e.g. International Standard Name Identifier (ISNI) for
funding organisations is recommended
• Example:
<rioxxterms:project
funder_name="Engineering and Physical Sciences Research Council"
funder_id="http://isni.org/isni/0000000403948681"
>
EP/K023195/1
</rioxxterms:project>
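On the producing side, a repository plugin could build this property along the following lines. This is a sketch only: the function name is hypothetical, the namespace URI is assumed to be the RIOXX 2.0 one, and the rule that at least one of `funder_name` or `funder_id` must be present reflects my reading of the profile and should be confirmed against its documentation.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for rioxxterms (RIOXX 2.0); verify against the XSD.
RIOXXTERMS_NS = "http://www.rioxx.net/schema/v2.0/rioxxterms/"
ET.register_namespace("rioxxterms", RIOXXTERMS_NS)


def make_project(project_id, funder_name=None, funder_id=None):
    """Build a rioxxterms:project element carrying funder and project ID."""
    # Assumption: the profile expects at least one way of naming the funder.
    if not funder_name and not funder_id:
        raise ValueError("supply funder_name, funder_id, or both")
    el = ET.Element("{%s}project" % RIOXXTERMS_NS)
    if funder_name:
        el.set("funder_name", funder_name)
    if funder_id:
        el.set("funder_id", funder_id)
    el.text = project_id
    return el


project = make_project(
    "EP/K023195/1",
    funder_name="Engineering and Physical Sciences Research Council",
    funder_id="http://isni.org/isni/0000000403948681",
)
print(ET.tostring(project, encoding="unicode"))
```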
17. license_ref
• adopted from NISO’s Open Access Metadata and Indicators
• takes an HTTP URI and a start date
• the URI should identify a license
• there is work under way to create a ‘white list’ of acceptable licenses
• embargoes can be expressed this way, with a license identified to ‘take effect’
at some (possibly) future date
• Example:
<ali:license_ref start_date="2015-02-17">
http://creativecommons.org/licenses/by/4.0
</ali:license_ref>
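The embargo idea can be sketched as a lookup of which license is in effect on a given day. The function name and the representation of each license_ref as a `(start_date, uri)` pair are assumptions for illustration; choosing the license with the latest start date that has already passed is one reasonable interpretation, not a rule quoted from the NISO recommendation.

```python
from datetime import date
from typing import List, Optional, Tuple


def active_license(license_refs: List[Tuple[str, str]], on: date) -> Optional[str]:
    """Return the URI of the license in effect on the given day, if any.

    license_refs holds (start_date, uri) pairs with ISO 8601 dates.
    """
    in_effect = [
        (date.fromisoformat(start), uri)
        for start, uri in license_refs
        if date.fromisoformat(start) <= on
    ]
    if not in_effect:
        return None  # still under embargo: no license has taken effect yet
    # Assumption: the most recently started license wins.
    return max(in_effect)[1]


refs = [("2015-02-17", "http://creativecommons.org/licenses/by/4.0")]
print(active_license(refs, date(2015, 1, 1)))   # before the start date
print(active_license(refs, date(2015, 2, 17)))  # on the start date
```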
20. principles
• purpose driven
• designed to meet a single, focussed use-case
• solve one problem well, avoid ‘feature creep’
• focussed on implementation
• has to be relatively easy to implement
• ‘shallow’ structure
• the simplest thing that can possibly work
• open development
• public consultation
• tested openly
• rapid development
• short iterations
22. Manifesto for Agile Software Development
We are uncovering better ways of developing
software by doing it and helping others do it.
Through this work we have come to value:
Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan
That is, while there is value in the items on
the right, we value the items on the left more.
http://agilemanifesto.org
23. applying these principles to RIOXX development
• Individuals and interactions over processes and tools
• we concentrated on what worked - & what made sense to the user/sponsor
• Working software over comprehensive documentation
• an application profile is fundamentally a set of documentation!
• however, RIOXX is implemented in software
• Customer collaboration over contract negotiation
• we worked as closely with users as possible, and worked very openly
• Responding to change over following a plan
• iterative - we developed RIOXX in short development cycles punctuated by
review
24. transferable Agile techniques
• iterative design and development with users
• high-bandwidth interaction with users
• short iterations or ‘sprints’
• documentation can be made this way just as with code
• starting with the ‘minimum viable product’ (MVP)
• continuous testing during development (and after!)
• testing aids development and understanding
25. continuous testing
• extremely important
• should be mechanistic, or semi-automated, wherever possible
• so that it actually gets done!
• should deliver immediate and useful feedback
• not just the usual XML schema validation - this is often important, but it is not
enough
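As an illustration of checks that go beyond schema validation, the sketch below runs a few profile rules over an already-parsed record and returns readable feedback. It is not the RIOXX testing framework: the dictionary layout of the record, the function names and the choice of rules are all assumptions made for the example.

```python
from datetime import date
from urllib.parse import urlparse


def _is_http_uri(value: str) -> bool:
    parsed = urlparse(value.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def check_record(record: dict) -> list:
    """Return a list of human-readable issues found in one record."""
    issues = []
    if not _is_http_uri(record.get("dc:identifier", "")):
        issues.append("dc:identifier is not an HTTP URI")
    for author in record.get("rioxxterms:author", []):
        author_id = author.get("id")
        if author_id is not None and not _is_http_uri(author_id):
            issues.append(f"author id {author_id!r} is not an HTTP URI")
    for lic in record.get("ali:license_ref", []):
        try:
            date.fromisoformat(lic.get("start_date", ""))
        except ValueError:
            issues.append("license_ref has a missing or malformed start_date")
    return issues


# Hypothetical record with three problems a schema alone might not explain well.
bad_record = {
    "dc:identifier": "LIBARTVICEprints.pdf",
    "rioxxterms:author": [{"name": "Lawson, Gerald", "id": "0000-0002-1395-3092"}],
    "ali:license_ref": [{"uri": "http://creativecommons.org/licenses/by/4.0"}],
}
for issue in check_record(bad_record):
    print(issue)
```

Run on a schedule against sample records from each known feed, a checker of this kind could produce the immediate, specific feedback described above.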
29. summary
• RIOXX has been created to help universities address open-access reporting
requirements from the UK Research & Funding Councils
• it has been developed using agile approaches and techniques borrowed from
software developers
• it has been implemented in 26 known repositories since January 2015, and has
been used to some extent in two international aggregation initiatives:
• OneRepo:
• http://onerepo.net/onerepo-single-page.pdf
• SHARE
• https://github.com/CenterForOpenScience/SHARE
• Its adoption is growing steadily :-)
30. Paul Walk
Head of Technology Strategy and Planning, EDINA
p.walk@ed.ac.uk
@paulwalk
thanks for listening!
the RIOXX metadata application profile is maintained &
supported by EDINA:
http://www.rioxx.net
Editor's Notes
Good morning (or afternoon, or evening, or whatever it is where you are)!
I’d like to thank FAO for this chance to talk to you today, particularly Imma Subirats and Karna Wegner who have been very helpful
today I will:
but first, a quick introduction to the idea of the metadata application profile
The development of an AP can be quite different to the development of a standard, mainly because of its scale, but also because it can involve a different range of people
application profiles exist on a continuum - from quite generalised through to more specific
(explain the diagram)
so, now I’ll talk a little about the RIOXX application profile
open access policies in UK relating to public-grant funded research
RIOXX is a classic application profile in that it adds constraints to existing terms and adds new ones as necessary
this is what we have produced - will say more about the testing framework later
explain V4OA
we’ll have a closer look at some of the properties
the decision to require an HTTP URI gives us two advantages:
we don’t need to specify the identifier scheme beyond this requirement
we can identify the scheme from the URI - e.g. DOI
acceptance date represents a more clearly identifiable ‘business event’
working closely with the OpenAIRE team, we have provided a mapping between RIOXX 2.0 and OpenAIRE 3.0
so, now I’d like to talk a little about the way in which we developed RIOXX
implementation is key. Previous efforts in this space have not been implemented
there are some practices which have emerged from software development over recent years which we might apply to application profile development
‘Agile’ has become an overloaded term, but it’s important to remember that it started somewhere with some principles:
Agile Manifesto couches itself in a series of ‘preferences’ - the phrases in bold towards the left
worth noting this is now 14 years old!
be Agile. Agile development is not necessarily a good fit for standards development, but it has something to offer the development of application profiles, especially if they are very focussed and tightly coupled to a specific problem
Agile techniques - transferable to AP development
we don’t have time today to look at all of these, but I want to examine one of them because I don’t think it has been done much elsewhere - the idea of continuous testing.
the subject of testing, and how we can apply it to AP development is worth a whole session on its own.
using the example of RIOXX
this is testing sample data from all known RIOXX implementations on a regular basis - and it’s completely automated
doing this openly on the web creates incentives for people to fix things!!
a detailed report is generated for each of the systems tested
this shows both the system developers and the end-users exactly which aspects of the AP have been invalidated
even shows them the raw data where these issues have occurred
testing for quality of implementation
characterising the usage