Institutional Identifiers internally and throughout the supply chain

ALPSP Seminar: Data the universe and everything
22nd January 2014
Laura Cox

Contents
1. Healthy data
2. Why now?
3. Institutional Identifiers – what and why?
4. Common data problems
5. Data governance
6. Data integration (linking your data together)
7. Institutional Identifiers in the supply chain
8. Institutional Identifiers – which?
9. ISNI IDs
10. What can you do now?

Why is healthy data important?
Good quality, healthy data can be
utilized to gain insight into customers,
business relationships and to support
strategic planning, decision making,
and ongoing business operations.
But when it’s unhealthy….

Poor data has real consequences
 Hard to get a true picture of relationships with institutions
 Lack of quality author (and affiliation) data
 Inability to see overlap between authors, members and customers
 Inaccurate holdings and revenue reports
 Protracted time and effort taken to analyse data
Everything becomes more difficult, and less accurate

Healthy records are:
 Complete
 Accurate
 Free of duplicates
 Current
 Consistent
 Conform to standards

Unique Identifiers
What are they? How can they help?
 Numeric or alpha-numeric designations which are associated with a single entity
 Entities can be an institution, person, or piece of content
 Enable the disambiguation of each entity
 Proper understanding of the customer, author, reader or institution
 Proper identification of content object, article, product, or package
 Can be used internally or in conjunction with external partners

Why should we worry about data now?
 Number of researchers increasing by 3% per annum*
 Number of articles increasing by 3% per annum; current output is 1.8-1.9 million per year*
 Number of journals increasing by 3.5% per annum*
 Growth in China has been in double digits for over 15 years*
 Increased demand for anytime/anywhere access
 Library budgets are frozen or being cut; less money for more content means we have to work smarter
* Ware, M and Mabe, M, The STM Report, 2012

What are Institutional Identifiers for?
Disambiguating:
 UCL:
   University College London (UK)
   Université Catholique de Louvain (Belgium)
   Universidad Cristiana Latinoamericana (Ecuador)
   University College Lillebælt (Denmark)
   Centro Universitario Celso Lisboa (Brazil)
   Union County Library (USA)
 NPL:
   National Physical Laboratory (UK)
   National Physical Laboratory (India)
 York University:
   University of York (UK)
   York University (Canada)
 Northeastern University:
   Northeastern University (Boston, USA)
   Northeastern University (Shenyang, China)
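The ambiguity above can be sketched in a few lines of Python. This is only an illustration: the `INST-…` IDs are invented for the example and are not real identifiers, and `initials` is a hypothetical helper. It shows that an acronym matches several institutions, while an ID matches exactly one.

```python
from collections import defaultdict

# Hypothetical records: the IDs are invented for illustration only.
RECORDS = [
    {"id": "INST-0001", "name": "University College London", "country": "UK"},
    {"id": "INST-0002", "name": "Université Catholique de Louvain", "country": "Belgium"},
    {"id": "INST-0003", "name": "Union County Library", "country": "USA"},
]

def initials(name):
    """Build an acronym from the capitalised words of a name."""
    return "".join(word[0] for word in name.split() if word[0].isupper())

# Looking up by acronym returns several candidates...
by_acronym = defaultdict(list)
for record in RECORDS:
    by_acronym[initials(record["name"])].append(record["id"])

# ...while looking up by identifier returns exactly one record.
by_id = {record["id"]: record for record in RECORDS}

print(by_acronym["UCL"])
print(by_id["INST-0002"]["name"])
```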

What are Institutional Identifiers for?
Consolidating:
 University of Oxford
 Univ. Oxford
 Oxford University
 Library, Oxford Univ.
 Radcliffe Science Library
 Bodleian Library
 Bodleian, Oxford
 Oxford, University of

Hierarchy View:
 University of Northampton
   Northampton Business School
   School of Education
   School of Health
   School of Science and Technology
     Division of Computing
     Division of Engineering
     Environmental & Geographical Sciences
     Institute for Creative Leather Technologies
   School of Social Sciences
   School of The Arts
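A minimal sketch of consolidation and hierarchy mapping, assuming a lookup table of name variants and a table of parent links. The IDs and the parent link are hypothetical and exist only for this example.

```python
# Name variants from the slide, mapped to hypothetical institutional IDs.
VARIANT_TO_ID = {
    "University of Oxford": "INST-1000",
    "Univ. Oxford": "INST-1000",
    "Oxford University": "INST-1000",
    "Oxford, University of": "INST-1000",
    "Bodleian Library": "INST-1001",
    "Bodleian, Oxford": "INST-1001",
}

# Hypothetical parent links (child ID -> parent ID) for a hierarchy view.
PARENT = {"INST-1001": "INST-1000"}

def resolve(name):
    """Return the institutional ID for a known name variant, else None."""
    return VARIANT_TO_ID.get(name)

def top_level(inst_id):
    """Follow parent links up to the top of the hierarchy."""
    while inst_id in PARENT:
        inst_id = PARENT[inst_id]
    return inst_id

print(resolve("Univ. Oxford"))
print(top_level(resolve("Bodleian, Oxford")))
```

Six name variants collapse to two records, and the library record rolls up to its parent institution.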

Use cases – the why
Identifiers enforce uniqueness
 Disambiguate institutional records
 Eradicate duplication of data
 Ensure correct delivery, entitlements and access rights
 Better understand your customer base and relationships with institutions
 Improve “trust” in data
 Map institutions into their hierarchy

Common data problems
 Most publishers have problems with data:
   Multiple accounts for each customer
   Multiple internal IT systems for different purposes
   Data entry without standard names or ID numbers
   Lack of hierarchy information
   No formal manner to track customers across systems

The challenge: Data Sources
 Multiple data sources – ‘system’ data silos
 Multiple locations – ‘geographic’ data silos
 Data entered by different people for different purposes
 Data from third parties in the supply chain
 Data from bought-in sources

The challenge: Data Sources
Typical publisher systems:
 Financial system
 CRM/Sales database
 Authentication system
 Fulfilment
 Usage statistics
 Submissions system
 Author database
 Document Storage (contracts and licences)
 …..

Data can be entered by:
 Organisation staff
 Authors
 Society members
 Agents in the supply chain
 3rd party organisations
 …..

Implementing a data governance plan
 Important considerations:
   What data is held, where is it held and how is it accessed?
   How can the data be used to further benefit different departments, processes or activities?
   Could the use of current or planned systems be expanded for further benefit?
   Is data highly accurate and consolidated or in need of cleansing?
   Are there applications of data that have not been explored?
   What requirements are there for additional data?

Improve data capture
 If you can – use web forms
 Implement required fields
 Data validation – at a minimum use naming conventions
 Address validation – postcode lookup
 Institution validation – institution lookup
 Web form consistency across systems
 Avoid free-text fields
 Make institutional identifiers a requirement
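The capture rules above could be enforced at the point of entry along these lines. This is a sketch under stated assumptions: the field names, the `KNOWN_INSTITUTIONS` lookup table and the `INST-1000` ID are hypothetical, and a real form would check against a maintained institution database.

```python
# Hypothetical required fields and institution lookup table.
REQUIRED_FIELDS = ("contact_name", "institution_id", "postcode")
KNOWN_INSTITUTIONS = {"INST-1000": "University of Oxford"}

def validate_submission(form):
    """Return a list of problems; an empty list means the form is acceptable."""
    errors = []
    # Required fields: reject missing or blank values.
    for field in REQUIRED_FIELDS:
        if not str(form.get(field, "")).strip():
            errors.append(f"missing required field: {field}")
    # Institution validation: the ID must exist in the lookup table.
    inst_id = form.get("institution_id")
    if inst_id and inst_id not in KNOWN_INSTITUTIONS:
        errors.append(f"unknown institution ID: {inst_id}")
    return errors

print(validate_submission({"contact_name": "A. Librarian", "institution_id": "INST-9999"}))
```

Selecting the institution from a lookup, rather than typing its name into a free-text field, is what keeps name variants out of the record in the first place.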

Implementing Institutional IDs
Turn your records from this…..

…..into this.

Data integration
 Using Institutional Identifiers to link internal systems:
   Prevent duplicate account creation
   Break down silos
   Keep data up-to-date and systems synchronised
   Enable staff to use data more effectively
   Simplify data transmission
   Improve overall data quality
[Diagram: Institutional Identifiers linking CRM, Electronic document storage, Financial System, Authentication, Membership system, Usage statistics, Author Database and Fulfilment system]
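A minimal sketch of linking two internal systems on a shared institutional ID, assuming each system stores that ID on its records. All account numbers, ledger numbers and IDs below are invented for illustration.

```python
from collections import defaultdict

# Hypothetical extracts from two internal systems.
crm = [
    {"account": "C-17", "institution_id": "INST-1000", "name": "Univ. Oxford"},
    {"account": "C-42", "institution_id": "INST-1000", "name": "Oxford University"},
    {"account": "C-58", "institution_id": "INST-2000", "name": "University of Northampton"},
]
finance = [
    {"ledger": "F-301", "institution_id": "INST-1000"},
    {"ledger": "F-302", "institution_id": "INST-2000"},
]

def link(crm_rows, finance_rows):
    """Group records from both systems under their institutional ID."""
    linked = defaultdict(lambda: {"crm_accounts": [], "ledgers": []})
    for row in crm_rows:
        linked[row["institution_id"]]["crm_accounts"].append(row["account"])
    for row in finance_rows:
        linked[row["institution_id"]]["ledgers"].append(row["ledger"])
    return dict(linked)

def duplicate_accounts(linked):
    """Institutions that have more than one CRM account."""
    return sorted(i for i, v in linked.items() if len(v["crm_accounts"]) > 1)

linked = link(crm, finance)
print(duplicate_accounts(linked))
```

The two Oxford accounts carry different names, so a name match would miss the duplicate; the shared ID exposes it.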

Linking author and institution IDs
 When authors and their affiliations are linked correctly, publishers gain:
   Market intelligence about authors and institutions
   Author and subscriber information mapped together
   Knowledge of where research funding is concentrated
   Reduction in time taken calculating open access charges (APCs)
 Institutions gain information about their overall research output
 Funders gain information about where authors reside and publish
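Mapping author and subscriber information together might look like the sketch below, assuming author affiliations and subscriber accounts both carry the same institutional ID. The names and IDs are placeholders.

```python
from collections import Counter

# Hypothetical author affiliations and subscribing institutions.
authors = [
    {"author": "A. Author", "institution_id": "INST-1000"},
    {"author": "B. Writer", "institution_id": "INST-2000"},
    {"author": "C. Scholar", "institution_id": "INST-1000"},
]
subscribers = {"INST-1000", "INST-3000"}

# Number of authors affiliated with each institution.
author_counts = Counter(a["institution_id"] for a in authors)

# Institutions that both publish with us and subscribe.
publishing_and_subscribing = sorted(set(author_counts) & subscribers)

# Institutions whose researchers publish with us but which do not subscribe.
publishing_not_subscribing = sorted(set(author_counts) - subscribers)

print(author_counts["INST-1000"], publishing_and_subscribing, publishing_not_subscribing)
```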

The scholarly supply chain
 Purpose:
   Serve the author and reader
   Disseminate content as widely as possible
   Ensure content is easily discoverable
   Provide information in an efficient and trouble-free manner regardless of:
     Content type
     User requirements
     Desired methods of access

The supply chain (simple version)
[Diagram: Author, Funders, Submission and Peer Review System, Publisher, Societies, Online Host or Technology Partner, Data Providers and Systems (multiple), Fulfilment House or System, Subscription Agent or Sales Agent, Consortium (shown twice), Library, Discovery Service, End User]

Supply-chain spaghetti
[Diagram: Author, Funders, Submission and Peer Review System, Publisher, Societies, Online Host or Technology Partner, Data Providers and Systems (multiple), Fulfilment House or System, Subscription Agent or Sales Agent, Consortium (shown twice), Library, Discovery Service, End User]

What could possibly go wrong?
 Records are unconnected through the supply chain, links fail:
   Between entities
   Between internal systems
   Between external systems
 Renewals are mishandled
 Journal transfers are mishandled
 Access and authentication are mishandled
 Authors and individuals are not linked to their institution
 Open access fees have to be checked manually
 Authors are not linked to their research
 Funders are not linked to the research they fund

Where stronger links are needed
 Finding a path to using standardized data, which:
   Eradicates duplicate records within and between systems
   Enables seamless communication between organizations
   Smooths the supply chain, removing ambiguity or lack of information for any party
   Enables higher quality of service
   Increases understanding of customer base and enables better decision making for everyone involved

…becomes organised, with accurate data and information flow

The vision
In an ideal world we would be able to utilise, provide and obtain data that is accurate, complete and easily joined together:
 Reducing problems and errors
 Providing better overall service
 Creating seamless processes
 Providing a better understanding of customers and our own businesses

External linking – in the supply chain
 Using Identifiers will:
   Ensure accuracy of information
   Speed up data transactions
   Reduce queries
   Reduce costs
   Open data up to new uses
   Ensure that authors receive credit for the work they produce
   Ensure that end users receive uninterrupted access to the content they need

A truly linked supply chain

Identifiers

Institutional Identifiers – which ones?
 JISC and CASRAI (Consortia Advancing Standards in Research Administration Information) report on Organisation IDs:
http://repository.jisc.ac.uk/5381/1/CC549D0011.0_org_ID_landscape_study.pdf
   Examined the landscape of organisational identifiers in the UK and identified 23 different IDs
   Based on interviews with key individuals
   Lots of detail on use cases for publishing, funders, and institutions

CASRAI report
 Disambiguating organisational information from multiple sources is typically described as “a nightmare”
 Benefits from effective unique identifiers are truly realised when data is shared
 Key aspects of identifiers that support the widest range of uses:
   Governance
   Trust
   Transparency
   Temporal
   Appropriate metadata

Identifiers identified
[Table: each identifier is also marked against the columns HEIs, Publishers, Funders, Companies, Curated, Regulated, Historic, and Mainly used for linking]
Coverage: Global
 Dun & Bradstreet
 FundRef
 ISNI
 ORCID
 Ringgold's Identify
 MACE & UK Federation
 VIAF
 Research Analytics
Coverage: UK
 Companies House
 Gateway to Research
 Government bodies
 HESA
 IDBR
 Janet
 Je-S/CDR OrgID
 Research Fish
 RCUK
 ROS
 UCAS
 UKPRN
Coverage: England
 HEFCE
Coverage: EU
 PIC
Please note that ORCID had not released the institutional affiliation at the point at which this report was published.

Global Identifiers
[Table: each identifier is also marked against the columns HEIs, Publishers, Funders, Companies, Curated, Regulated, Historic, and Mainly used for linking]
Coverage: Global
 Dun & Bradstreet
 FundRef
 ISNI
 ORCID
 Ringgold's Identify
 MACE & UK Federation
 VIAF
 Research Analytics

ISNI
 ISO Standard 27729
 ISNI is designed to be a “bridge identifier”
 Covers any type of entity
[Diagram: a shared ISNI Number links Party ID 1 and Party ID 2, each with its own Proprietary Information and/or Metadata]
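The “bridge identifier” idea can be sketched as two lookup tables that meet at the ISNI. Everything here is hypothetical: the party IDs are invented, and the 16-digit value is a placeholder, not a real assigned ISNI. Each party keeps its own proprietary ID and metadata, and only the shared ISNI is exchanged.

```python
# Placeholder value standing in for a shared ISNI; not a real assigned ISNI.
SHARED_ISNI = "0000000000000001"

# Party 1 maps its proprietary ID to the ISNI (hypothetical IDs).
PARTY1_TO_ISNI = {"P1-00045": SHARED_ISNI}

# Party 2 maps the ISNI to its own proprietary ID (hypothetical IDs).
ISNI_TO_PARTY2 = {SHARED_ISNI: "CUST-9931"}

def translate(party1_id):
    """Translate a Party 1 ID to the matching Party 2 ID via the ISNI."""
    isni = PARTY1_TO_ISNI.get(party1_id)
    if isni is None:
        return None
    return ISNI_TO_PARTY2.get(isni)

print(translate("P1-00045"))
```

Neither party needs to know the other's internal numbering or share its proprietary metadata for the records to be matched.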

ISNI IDs
 Ringgold is an ISNI Registration Agency for institutions
 Unique ISNI Institutional ID number can connect any data and any systems
 ISNI IDs should be used by publishers and across the scholarly supply chain to:
   Link systems using the ID numbers
   Link data sets which contain proprietary metadata
   Provide clean data transmission
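One practical aid to clean transmission is checking that an ISNI is well formed before it is stored or sent. An ISNI has 16 characters, the last being a check character computed with the ISO/IEC 7064 MOD 11-2 scheme. The sketch below implements that check; the function names are our own, and the sample strings are used only because they satisfy the checksum, not because they identify any particular institution.

```python
def isni_check_character(first15):
    """Compute the ISO/IEC 7064 MOD 11-2 check character for 15 digits."""
    total = 0
    for ch in first15:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)

def is_well_formed_isni(value):
    """True if value has 16 characters and a correct check character."""
    compact = value.replace(" ", "").replace("-", "").upper()
    body = compact[:15]
    if len(compact) != 16 or not (body.isascii() and body.isdigit()):
        return False
    return isni_check_character(body) == compact[15]

print(is_well_formed_isni("0000 0002 1825 0097"))
print(is_well_formed_isni("0000 0002 1825 0098"))
```

A passing check only shows that the string was not mistyped; it does not show that the ISNI has been assigned or that it belongs to the intended institution.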

ISNI spans all industries, market segments, and regions
Academia
Medical
Corporate
Government
Not-for-profit
Public libraries
Schools

Publishers
Funding bodies
Intermediaries
Distributors

http://isni.org/

What can YOU do now?
 Engage with the problems you have with data
 Find some resources – think about time not just money
 Consider how data could better serve your organisation
 Appoint a data champion and document everything
 Generate a data governance policy
 Create some basic rules for data entry
 Utilise universal identifiers to clean and link your data
 Work with suppliers and customers to utilise institutional identifiers to strengthen the supply chain

Laura Cox
President, Chief Financial and Operating Officer
Ringgold Inc.
laura.cox@ringgold.com
