NISO Webinar Authority Control

NISO Webinar
Authority Control:
Are You Who We Say You Are?
Wednesday, February 11, 2015
Speakers:
Simeon Warner, Director of Repository Development, Cornell University Library
Laura Dawson, Product Manager, ProQuest
Thomas Hickey, Chief Scientist, OCLC
http://www.niso.org/news/events/2015/webinars/authority_control/

ORCID identifiers in research
workflows
Simeon Warner, Cornell University Library
with thanks to
Laure Haak, ORCID Executive Director and
Josh Brown, ORCID Regional Director, Europe
for slides and comments

“Use ORCID iDs in research
workflows to solve name
ambiguity and save everyone
a bunch of effort!”

ORCID background
• open - anyone can register, any organization with interest in
research and scholarly communications can join, iDs intended
for reuse, software open source
• non-profit - incorporated in USA, also ORCID EU
• community-driven - where community includes all sectors of
research process including publishers, funders, universities,
and the researchers themselves
two core functions:
1. a registry for obtaining a unique identifier and managing a
record of activities
2. APIs that support system-to-system communication and
authentication
see: http://orcid.org/content/initiative

ORCID status and adoption
A little over 2 years since launch, over 1.1M ids created,
over 190 members from all sectors and around the world.
[Chart: ORCID iDs created by month, October 2012 to August 2014 (axis 0 to 900,000), by creation method: Creator, Website, Trusted Party]
Sector breakdown: Universities & Research Orgs 45%, Publishing 25%, Associations 12%, Repositories & Profile Systems 11%, Funders 7%
Regional breakdown: Americas 50%, EMEA 35%, AsiaPac 15%

National integrations and membership
http://openaccess.blogg.kb.se/2013/01/30/slutrapport-fran-projekt-forfattarindentifikatorer/
http://www.jisc.ac.uk/whatwedo/programmes/di_researchmanagement/researchinformation/orcid.aspx
http://orcid.org/blog/2014/09/03/denmark-adopts-orcid-consortium-approach-orcid-implementation
http://orcidpilot.jiscinvolve.org/wp/

ORCID Scope
ORCID = Open RESEARCHER AND CONTRIBUTOR Identifier
o Research activities
o Living people
o There are fewer researchers than the scope of people and
personas covered by ISNI or VIAF
CONTRIBUTOR -- ORCID intended to be used for the spectrum of
actors in the research process, not just authors, and records roles.
o Already supports roles like translator, principal investigator
o 2012 Harvard Workshop
http://projects.iq.harvard.edu/attribution_workshop/home
o 2014 Project CRediT Workshop
http://www.eventbrite.ca/e/project-credit-workshop-tickets-10314211083

Researcher driven
Creation methods:
• integrations dominate
• website second
• institutional creation
Researcher must be involved to create or activate the ORCID iD,
and can control the privacy settings and/or add information.
Recommend institutions use the trusted party creation method
rather than direct record creation. Need to connect with and
educate users anyway. Can pre-populate registration fields.
[Chart: ORCID iDs created by month, October 2012 to August 2014, by creation method: Creator, Website, Trusted Party]

Leveraging ISNI Organization IDs
ORCID uses Ringgold (an ISNI registrar) organization list to support
connection between individuals and education and employment
affiliations.

Leveraging FundRef identifiers
Funding agency list coordinated with FundRef
Auto-complete based
on FundRef data

Integration of ORCID iDs in research
workflows

Publication round trip
ORCID iDs are intended to be integrated into research and
publication workflows, and become embedded in the
metadata. ORCID iDs will thus be associated with new
works at the time of publication.
[Diagram labels: ORCID record; Manuscript submission; Review; Publication with DOI & ORCID(s); CrossRef DOI assignment; Verified ORCID, update permission; Readers]

Round trip process and implications
Publisher captures ORCID iD during manuscript submission
o Authenticated process, no mistyping, accurate
o User may grant permission to add works later
Publisher includes ORCID iD in metadata when minting DOI
o Will be available to support discovery
o Available in CrossRef search
Publisher/CrossRef writes metadata back to ORCID record
o Holder notified, can control visibility
o Saves effort updating record
o Information flow to other systems such as local profile (e.g.
I've linked my ORCID record with my VIVO profile)
Similar process for datasets, mediated by DataCite
ref: http://orcid.org/blog/2014/11/21/new-functionality-friday-auto-update-your-orcid-record

Funder workflow
• Use for applicants and reviewers
• Profile data reduces applicant/grantee form filling burden
• Improve reporting accuracy
• Pull publications, datasets and other works based on ORCID iD
ref: http://support.orcid.org/knowledgebase/articles/426596-orcid-funder-workflow

An ounce of ambiguity avoidance is worth a
pound of disambiguation
-- with apologies to Benjamin Franklin
• Workflow integration avoids name ambiguity at source
• Resulting data good for disambiguation of older data
• Resulting data good for compilation of authority records

“How much information should my
ORCID record have?”

Minimal record
Registration is really quick and
easy, 30 seconds perhaps
1. name
2. email
3. password
4. agree to privacy policy and
conditions
A minimal ORCID record that is
enough to get an iD and use it in
research workflows

Helpful ORCID record
Reasons to add a little more information:
1. Provide enough information so that someone who follows a
link to your record, or searches for you, can understand which
"John Smith" you are
o alternate names
o education and employment information
o a few works. Everyone likes to show off their best work …
o opens the door for disambiguation of existing data
2. Provide other identifiers so that ORCID can act as a
switchboard to connect your identities in different systems.
o local profile id (e.g. my VIVO id at Cornell)
o Scopus Author ID, Researcher ID, ISNI
o (Using the search and link wizards that connect to these
other systems is also the easiest way to add works.)

Expansive ORCID record
There are many import wizards, which allow
o connection of an ORCID record to other identifiers
o import of works, grants, etc.
o recording of the source, which provides a way to assess trust
ORCID registry has facilities for users to enter works themselves,
specify their roles, etc.
ORCID UI groups information about the same work from multiple
sources
o user may select preferred one to display
You may make your ORCID record a complete picture of your research
contributions if you choose. But a complete record isn't necessary
for ORCID to work.

ORCID is a hub
The ORCID identifier connects researchers with their works
(papers, grants, datasets, and more), organizations, and
other identifiers.
ORCID APIs enable data exchange between research information
systems.
[Diagram: hub connecting ORCID to Other Identifiers, Funders, Higher Education and Employers, Professional Associations, Repositories, and Publishers. Identifier labels shown: DOI, ISBN, Thesis ID, ISNI, Researcher ID, Scopus Author ID, internal identifiers, Member ID, Abstract ID, FundRef, GrantID]

Hub identifier linking to other
identifiers and to profiles in
other systems

… and data in machine form too
$ curl -H "Accept: application/orcid+xml" \
  "http://pub.orcid.org/0000-0002-7970-7855/orcid-bio" \
  | grep external-id-url
<external-id-url>
http://isni.org/isni/0000000351311901
</external-id-url>
<external-id-url>
http://vivo.cornell.edu/individual/individual24416
</external-id-url>
<external-id-url>
http://www.researcherid.com/rid/E-2423-2011
</external-id-url>
<external-id-url>
http://www.scopus.com/inward/authorDetails.url?authorID=7103063073&partnerID=MN8TOARS
</external-id-url>
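The same extraction can be sketched in Python instead of grep. This is a minimal, offline illustration: the `SAMPLE` string copies the four `external-id-url` values shown above, the `<sample>` wrapper element and the function name `external_id_urls` are hypothetical, and no claim is made about the full shape of the API response.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper around the four external-id-url values from the slide.
# The "&" in the Scopus URL is escaped as "&amp;" so the sample is valid XML.
SAMPLE = """<sample>
  <external-id-url>http://isni.org/isni/0000000351311901</external-id-url>
  <external-id-url>http://vivo.cornell.edu/individual/individual24416</external-id-url>
  <external-id-url>http://www.researcherid.com/rid/E-2423-2011</external-id-url>
  <external-id-url>http://www.scopus.com/inward/authorDetails.url?authorID=7103063073&amp;partnerID=MN8TOARS</external-id-url>
</sample>"""


def external_id_urls(xml_text):
    """Return the text of every external-id-url element, ignoring namespaces."""
    root = ET.fromstring(xml_text)
    urls = []
    for element in root.iter():
        # Strip a "{namespace}" prefix if the document declares one.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "external-id-url" and element.text:
            urls.append(element.text.strip())
    return urls


if __name__ == "__main__":
    for url in external_id_urls(SAMPLE):
        print(url)
```

Parsing the XML rather than grepping lines keeps each URL whole even when the response is not pretty-printed.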

Pointers
Register at https://orcid.org/register if you haven’t already!
http://orcid.org/
• Research organizations: http://orcid.org/organizations/institutions
• Publishers: http://orcid.org/organizations/publishers
• Associations: http://orcid.org/organizations/associations
• Funders: http://orcid.org/organizations/funders
• Researchers: http://orcid.org/content/initiative
Membership http://orcid.org/about/membership
• Questions: membership@orcid.org
Blog http://orcid.org/category/newsletter/blog
Slides: http://www.slideshare.net/simeonwarner/orcid-identifiers-in-research-workflows

ISNI
Disambiguating Public Identities

What Is ISNI
• ISO Standard, published in 2012
• International Standard Name Identifier
• Numerical representation of a name
– 16 digits
– Assigned to public figures, contributors of content –
researchers, authors, musicians, actors, publishers,
research institutions – and subjects of that content (if
they are people or institutions).
– Example: 0000 0004 1029 5439
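The last of the 16 characters is a check character. As a minimal sketch, assuming the ISO 7064 MOD 11-2 scheme used by ISNI and by ORCID iDs, the check can be written as below. The function names are illustrative only and are not part of any ISNI or ORCID library.

```python
def mod11_2_check_char(base15: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 digits."""
    total = 0
    for ch in base15:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    # A result of 10 is written as "X".
    return "X" if result == 10 else str(result)


def is_valid_identifier(value: str) -> bool:
    """Check a 16-character ISNI or ORCID iD, ignoring spaces and hyphens."""
    compact = value.replace(" ", "").replace("-", "").upper()
    if len(compact) != 16:
        return False
    base = compact[:15]
    if not (base.isascii() and base.isdigit()):
        return False
    return mod11_2_check_char(base) == compact[15]


if __name__ == "__main__":
    print(is_valid_identifier("0000 0004 1029 5439"))  # ISNI example from the slide
    print(is_valid_identifier("0000-0002-7970-7855"))  # ORCID iD used in the curl example
```

A check like this catches most single-digit typing errors, but it says nothing about whether the identifier has actually been assigned.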

Who is ISNI
• Founding members
– IFRRO (International Federation of Reproduction
Rights Organizations)
– CISAC (International Confederation of Authors and
Composers Societies)
– SCAPR (Societies’ Council for the Collective
Management of Performers’ Rights)
– OCLC
– CENL (Conference of European National Librarians),
represented by the British Library and the National
Library of France
– ProQuest, represented by Bowker

ISNI Organizational Structure
• Board of Directors
• Members
• Quality Team
• Registration Agencies
• Ongoing assignments / general public

How Does ISNI Registration Work
• Publisher submits names for assignment through a Registration
Agency
• RA works with the publisher to ensure the data feed is well-
formatted, and sends that feed to the Assignment Agency
• AA assigns as many ISNIs to the names in the feed as it can, using
complex algorithms and business rules that evolve with each feed
• AA returns a file of names with ISNIs attached to them
– This may not be the full file of names
– Ambiguous names are held for review by Quality Team
– QT assignments and other exceptions (assignments as a result
of improvements to the algorithm) are returned to RA quarterly
– Process is not instant. Assignment may be immediate if the
name and other information is unique, but frequently
assignments take a week or two.

Stage One
1. Customer submits data to Registration Agency
2. Registration Agency sends file to Assignment Agency
3. Assignment Agency assigns as many ISNIs to the names as it can

Stage Two
1. Assignment Agency sends assigned file to Registration Agency
2. Registration Agency sends assigned file to Customer
3. Customer reviews, QAs, ingests

Stage Three
1. Assignment Agency sends updates on a monthly basis
2. Registration Agency disperses files to appropriate Customers
3. Customers ingest updates

Display
• Only minimal metadata is displayed
• Not meant as a comprehensive profile
• ISNI is a tool for linking data sets, collocation, and
disambiguation
• Enhancements to the record can be made but not
required

Bridge identifier linking disparate data sets
ISNI links

Who is using ISNIs?
• Wikipedia/Wikidata
• VIAF
• Access Copyright
• Scholar Universe
• British Library
• JISC
• Musicbrainz
• Macmillan (Digital Science)
• Booknet Canada (piloting)
• Authors Guild (piloting)
• Books in Print ONIX 2.1 extracts (sent to Google, B&N,
Chegg and others)

How many names in the ISNI database?
• Over 8,000,000 assigned
• 10,112,931 provisional (awaiting a match from another
data set for corroboration)
• Your author names may well already have ISNIs.
http://www.isni.org/search.

Use Case: Research Institution

Use Case: Cross-Domain Linking

Data Quality
• Based on matching names to existing records in
database (over 17 million names)
• Strict criteria for assigning ISNIs to names
• Quality team oversight (manual edits)
– British Library
– National Library of France
– OCLC

Assignment Criteria
• If on the common surname list:
– Birth date
– Death date
– ISBN(s)
– Title(s)
– Co-authors or institutional affiliation
• If not on the common surname list
– Title(s)
– Birth date
– Death date
– Any other distinguishing factors (“is not”)
• If unique
– Immediate assignment
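One way to read the criteria above is as a triage step that picks which evidence fields to consult for a name. The sketch below is illustrative only: the field names, the `triage` function, and the "review_evidence" label are hypothetical, and the slide does not say how many fields must match, so no threshold is applied.

```python
# Evidence fields per the slide, in the order listed. Key names are hypothetical.
COMMON_SURNAME_FIELDS = (
    "birth_date",
    "death_date",
    "isbns",
    "titles",
    "co_authors_or_affiliation",
)
OTHER_FIELDS = (
    "titles",
    "birth_date",
    "death_date",
    "other_distinguishing_factors",
)


def triage(record, common_surnames, is_unique):
    """Return (decision, evidence fields present) for one name record.

    The matching threshold is deliberately left out because the source
    does not state one.
    """
    if is_unique:
        return ("immediate", [])
    surname = record.get("surname", "").lower()
    fields = COMMON_SURNAME_FIELDS if surname in common_surnames else OTHER_FIELDS
    available = [name for name in fields if record.get(name)]
    return ("review_evidence", available)


if __name__ == "__main__":
    smith = {"surname": "Smith", "birth_date": "1950", "titles": ["A Title"]}
    print(triage(smith, {"smith"}, is_unique=False))
```

Keeping the two field lists as data makes it easy to adjust them as the business rules evolve with each feed.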

ISNI and ORCID
• ORCID numbers are a subset of the numbers in ISNI’s
database
• Working towards alignment, with ultimate goal of single
assignment
• There is ISNI representation on the ORCID Technical
Steering Group, and ORCID representation on the ISNI
Technical Committee
• A researcher may have both an ORCID and an ISNI

Thomas Hickey
Chief Scientist, OCLC Research
2015 February
NISO Webinar on Authority Control
VIAF Relations

Virtual International Authority File
• Grew out of collaboration with national libraries
• Implemented and run by OCLC
• VIAF Council helps oversee it
• ~36 files, mainly from national authority files
• Everything libraries control other than topical
subject headings is in scope
– Personals, corporates, families
– Jurisdictionals, geographics
– Works, expressions
– Imaginary characters, etc.

Why multiple files?
• Different
– Information collected
• Private vs. public
• Identification vs. comprehensive
– Technologies and systems
• APIs
– Time scales
• Batch vs. interactive creation
• Historical vs. contemporary
– Business models

VIAF’s characteristics
• Origins: library authorities
• What is being identified: entities libraries control
• Who creates it: library staff
• Range of entities: very broad
• Priorities and control: libraries
• What can be shared: open

Relationship with ISNI
• Both systems run by OCLC
– VIAF helped get ISNI started
• Problems
– Each absorbs the other’s data
– Feedback loops!
• Who’s in charge?
– ISNI now indicates reviewed records
• Relationships treated as though from xA
• Can both merge and split VIAF clusters

Relationship with Wikipedia
• VIAF harvests Wikipedia dumps monthly
• Pages about people that are in VIAF are added
• VIAFbot back-loaded links into Wikipedia
– http://en.wikipedia.org/wiki/User:VIAFbot

Relationship with WorldCat
• One of the main uses of VIAF internally at
OCLC is controlling names
• Multilingual Bibliographic Structure project
• Generate ‘xR’ authority records
– Works
– Expressions

[Diagram: three groups labeled OCLC Production Services, External OCLC Research Systems, and Internal OCLC Research Resources. Components shown: enhanced WorldCat, Kindred Works, Classify, Identities, FictionFinder, Cookbook Finder, LCSH, FAST, VIAF, GMGPC, Linked Data Entities, WORKS, GSAFD, GTT, DDC, LCTGM, MeSH]

[Diagram labels: enhanced WorldCat; WORKS; xR Sandbox; Multi-lingual Bib Records; VIAF; FRBR Clustering]

Unexpected interactions
• Drive towards comprehensiveness
– More information about entities
– More entities
• Importing other files
• Keeping up with updates
• Recognizing source of information
• What to trust
• How to leverage limited staff

NISO Webinar • February 11, 2015
Questions?
All questions will be posted with presenter answers on
the NISO website following the webinar:
http://www.niso.org/news/events/2015/webinars/authority_control/

Thank you for joining us today.
Please take a moment to fill out the brief online survey.
We look forward to hearing from you!
THANK YOU
