NISO Webinar: Identify This! Identify That! New Identifiers and New Uses


Just about everyone is familiar with the ISBN for books and the ISSN for serials. But new identifiers and new identifier standards have been developed for resources, such as the International Standard Text Code (ISTC), and for people and organizations, such as the International Standard Name Identifier (ISNI). NISO's January 2012 webinar, Identify This! Identify That! New Identifiers and New Uses, to be held on January 11 from 1:00 to 2:30 p.m. EST, will discuss several new identifiers as well as new uses for older identifiers.

Published in: Technology, Business

  • The ISNI database went live in November 2011. The initial database, including contributions from VIAF (libraries), professional societies, trade organizations and rights management societies, contains just under 1 million assigned names. The ISNI operational architecture and assignment system is outlined, together with an overview of the various use cases for ISNI.
  • ISNI = International Standard Name Identifier. OCLC is a founding member of the ISNI-IA (International Agency) along with the Centre for European National Libraries (CENL), rights management societies and ProQuest. IFRRO, the International Federation of Reproduction Rights Organisations in Brussels, represents 57 organisations; CISAC, the International Confederation of Societies of Authors and Composers in Paris, represents 225 societies; and IPDA, the International Performers Database Association in Stockholm, represents 35 societies worldwide. CENL is represented by the Bibliothèque nationale de France and the British Library. ProQuest is a large company based in Michigan which, among other things, produces Books in Print and assigns ISBN and ISTC in North America. Data from the founding members, including data from the JISC names project, is being collected into a database managed by OCLC's CBS system. OCLC is also contributing a copy of the VIAF file (Virtual International Authority File) as the base cross domain file of ISNI. Once the database has been established, later this year, OCLC's CBS system will run the ISNI assignment system. ISNI will help link data within and across databases, thus providing the infrastructure for significantly improved name searching and linking. Moreover, by sharing their data resources, ISNI participants from libraries, rights management and trade organisations are co-operating to achieve high quality data and at the same time realise processing efficiencies. This unprecedented cross-sector alliance is very exciting.
  • Cooperation possible– role, contribution, specifications, building initial database, business model +++
  • The base file of the ISNI database
  • Thank you for giving me the opportunity to update you on ORCID and the many researcher identifier initiatives out there.
  • While working on ORCID, I have learned that there are three times when researchers seem to care the most about their researcher identity. This marries up well to the three main constituents of ORCID: Academia, Funders, Publishers.
  • Here is the main way in which we see publishers and their authors interacting with ORCID. This will allow publishers to light up all of those author links with information about authors.
  • We have publicly announced our 10 principles. They govern everything that ORCID does.
  • Here are some of our newest participants: Highwire, BMJ Group, Frontiers and Journal of Bone and Joint Surgery
  • We are seeing about 12 new participants every month. Academic institutions are the fastest growing segment.
  • We currently have participants from 38 different countries within ORCID.
  • Every progress report deserves a Gantt chart. You can see that 2010 was mostly dedicated to establishing the not-for-profit organization, defining our principles and scope, and testing some preliminary (alpha) software.
  • 2011 is about building the system and defining the business using sponsorship to stay afloat. If all goes well we will have our first phase system ready in early 2012 in time for the ORCID board to take a decision about going live. This is a big decision as once we start registering ORCIDs, we are committed to making them persistent.
  • Here is a brief look at ORCID Phase 1.
  • Researchers will be able to register themselves. Institutions will also be able to do so on their behalf. Ultimate control of the record will be subject to privacy settings.
  • (Credit: Geoff Bilder) ORCID will ultimately combine the strengths of self-asserted, socially-validated, and organizationally-asserted identity systems. Self-asserted identity systems are familiar to us from the internet, where most non-commerce systems are self-asserted. In other words, the subject chooses what to say about themselves. In ORCID, researchers will be able to edit and manage their own profiles. Socially-validated identity systems work by exploiting others in the network to provide a check on self-assertion. So, for instance, popular technical advice sites like "Stack Overflow" use peers to check and validate advice given by other members of the community. ORCID, being a system open to all researchers, will enable researchers to inspect each other's claims and assertions. Organizationally-asserted identity systems are the gold standard of identity, in that they are considered to be the most reliable, but also the most expensive to run (think DMV or passport office). Still, ORCID will also encourage organizations to make assertions about researchers, such as: a) Brown University asserts that Josiah Carberry is a faculty member; b) Mellon Foundation asserts Josiah Carberry was awarded a grant; c) Nature asserts that Josiah Carberry was the author of this paper. By exploiting the best of these different identity approaches, we aim to distribute the work of disambiguation and quality control amongst all the major ORCID stakeholders.
  • ORCID plans to be the one to bridge the other scholarly author identification systems by registering the identifiers of all other relevant standalone services (silos big and small)
  • So what makes ORCID different?
  • Thank you!

    1. Identify This! Identify That! New Identifiers and New Uses. January 11, 2012. Speakers: Roy Crego, Janifer Gatenby, and Chris Shillum
    2. ISTC: The Identifier for Textual Works. Roy Crego, 11 January 2012
    3. What is the ISTC?
       • A global identification system for textual works
       • ISO Standard (21047) since March 2009
       • 16-digit alphanumeric code
         - Digits 1-3 = Registration Agency Identifier
         - Digits 4-7 = Year of registration
         - Digits 8-15 = Title number
         - Digit 16 = Check digit
       • Example: A03-2011-0000010B-1
       A member of the ProQuest family
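
The digit layout on this slide can be illustrated with a short parser. This is a minimal sketch, not a reference implementation: the function name `parse_istc` and the `ISTC` tuple are hypothetical, the restriction to hexadecimal characters (0-9, A-F) is an assumption based on the examples in this deck, and the check digit is only extracted, not verified.

```python
import re
from typing import NamedTuple


class ISTC(NamedTuple):
    agency: str        # digits 1-3: registration agency identifier
    year: str          # digits 4-7: year of registration
    title_number: str  # digits 8-15: title number
    check_digit: str   # digit 16: check digit (extracted, not verified)


# Assumption: characters are hexadecimal and the year is four decimal digits.
_ISTC_RE = re.compile(r"^([0-9A-F]{3})([0-9]{4})([0-9A-F]{8})([0-9A-F])$")


def parse_istc(text: str) -> ISTC:
    """Split an ISTC such as 'A03-2011-0000010B-1' into its four elements."""
    compact = text.replace("-", "").replace(" ", "").upper()
    match = _ISTC_RE.match(compact)
    if match is None:
        raise ValueError(f"not a 16-character ISTC: {text!r}")
    return ISTC(*match.groups())


example = parse_istc("A03-2011-0000010B-1")
print(example.agency, example.year, example.title_number, example.check_digit)
```

Hyphens are treated as display separators only, so the same code accepts the compact 16-character form.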
    4. ISTC Key Features
       • Collocation: groups products containing the same textual work
       • Hierarchical Structure: uses Source ISTC to tie derivative products (abridged audio, Spanish translation, 2nd edition) to parent work
       • Central Registration: the same ISTC is used on the same text from all publishers across all countries
    5. ISTC International
       • 10 ISTC Registration Agencies representing 6 languages to date:
         - BoekenBank (Flemish)
         - Bowker (English)
         - BTLF (Canada) (French)
         - CCR (Chinese)
         - Cercle de la Librairie-Electre (French)
         - FIRP (Russian)
         - MVB (German)
         - Nielsen UK (English)
         - Nielsen-New Zealand (English)
         - Thorpe-Bowker (Australia) (English)
       • Central ISTC Database (STRS) went live in 2009.
       • New ISTC-International site launched July 2011 with public ISTC lookup.
    6. ISTC Lookup Utility
    7. The Need?
       • Over 3 million ISBNs were assigned in the US in 2010.
       • E-books, reprints, and self-publishing are adding to that number daily.
       • Metadata for similar products needs to be linked
       The ISTC Solution
       • ISTC designed for use by computers on the web
       • ONIX for ISTC format (ver. 1.01)
         - ISNI Support within ContributorIDType tag
       • ONIX for Books support
         - Ver. 2.1 WorkIdentifier
         - Ver. 3.0 Work Relation Code (01=contains work; 02=source work)
       • MARC Discussion Paper (2010-DP04)
         - Consensus to use 024 and 787 fields
    8. Find Same Text in Many Formats/ISBNs
       • Breaking Dawn: ISTC A03-2010-0000000F-F
       • 9781616579203 (Audiobook): ISTC A03-2010-0000000F-F
       • 9780606231084 (Hardcover): ISTC A03-2010-0000000F-F
       • 9780316176156 (Paper): ISTC A03-2010-0000000F-F
       • 9780316032834 (E-Book): ISTC A03-2010-0000000F-F
    9. Find Derivative Versions of the Text
       • The Red Badge of Courage (1895): ISTC A03-2010-00000093-C
       • 9780393930757 (Text with Criticism): ISTC A03-2010-00000092-9, Source ISTC A03-2010-00000093-C
       • 9781603390354 (Abridged CD-Audio): ISTC A03-2010-0000008F-7, Source ISTC A03-2010-00000093-C
       • 9780756958107 (Adapted Graphic Novel): ISTC A03-2010-00000099-E, Source ISTC A03-2010-00000093-C
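
The two lookups shown on slides 8 and 9 (same text across formats, and derivative versions tied to a parent work) can be sketched as two small indexes. This is only an illustration: the `Product` record and `index_by` helper are hypothetical names, and pairing each ISBN with a format follows the order in which they appear on the slides, which is an assumption.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Product(NamedTuple):
    isbn: str
    form: str
    istc: str
    source_istc: Optional[str] = None  # parent work, if this is a derivative


BREAKING_DAWN = "A03-2010-0000000F-F"
RED_BADGE = "A03-2010-00000093-C"

# Values copied from the slides; ISBN-to-format pairing assumed from slide order.
PRODUCTS = [
    Product("9781616579203", "Audiobook", BREAKING_DAWN),
    Product("9780606231084", "Hardcover", BREAKING_DAWN),
    Product("9780316176156", "Paper", BREAKING_DAWN),
    Product("9780316032834", "E-Book", BREAKING_DAWN),
    Product("9780393930757", "Text with Criticism", "A03-2010-00000092-9", RED_BADGE),
    Product("9781603390354", "Abridged CD-Audio", "A03-2010-0000008F-7", RED_BADGE),
    Product("9780756958107", "Adapted Graphic Novel", "A03-2010-00000099-E", RED_BADGE),
]


def index_by(products, field):
    """Group ISBNs by the given Product field, skipping records where it is unset."""
    index = defaultdict(list)
    for product in products:
        value = getattr(product, field)
        if value is not None:
            index[value].append(product.isbn)
    return dict(index)


same_text = index_by(PRODUCTS, "istc")            # collocation
derived_from = index_by(PRODUCTS, "source_istc")  # hierarchy

print(same_text[BREAKING_DAWN])
print(derived_from[RED_BADGE])
```

The first index answers "which products carry this exact text?", the second "which products derive from this work?".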
    10. Find Translations and Title Variations
       • 2 Spanish Translations
    11. Rollout
       • Bowker is storing ISTCs in Books In Print for distribution when "critical mass" is achieved.
       • Feedback needed on what is "critical mass."
       • Publishers, libraries and retailers all need to encourage the use of ISTC in product records.
       • ISTC Registry data available (by license) through web call.
    12. Additional Information
       • Bowker ISTC Agency: Roy Crego, Product Manager ISTC, ISNI, DOI, (908) 219-0240
       • International ISTC Agency
       • ISTC Search Page
       • Specifications
       • User Manual
    13. ISNI: Scope, Participation and Operation. NISO Webinar "Identify this, identify that: New identifiers and new uses", January 11th 2012. Janifer Gatenby, OCLC EMEA
    14. At January 2012
       • ISNI is an ISO Standard ready for publication (FDIS)
       • ISNI-IA incorporated in the UK in December 2010
       • ISO contract agreed; in process of signature
       • OCLC (Leiden) appointed as the ISNI-IA Assignment Agency
       • Created the initial ISNI-IA database (2011)
       • First million ISNIs assigned and being diffused to data contributors
       • Ongoing assignment operations in test, to start Q1 2012
    15. The ISNI-IA Founding Members
       • CENL (Centre of European National Libraries)
         - 48 European national libraries, represented by BL and BnF
       • CISAC (International Society of Authors and Composers)
         - 225 societies from 118 countries (87% musical composers)
       • IFRRO (International Federation of Reproduction Rights Organisations)
         - 135 organisations from 74 countries
       • IPDA (International Performers Database Association)
         - 37 societies from 28 countries
       • OCLC
       • Proquest/Bowker (BIP)
    16. [Diagram labels: Quality team; VIAF; ISNI; Board; Rights Management Organisations; Libraries, Education, Publishing, Trade]
    17. Why have ISNI?
       • To differentiate one Will Smith from the other
       • International and Cross Disciplinary in Scope
    18. Typical result of a web keyword search on Will Smith
       • Retrieves noise titles
       • Does not retrieve titles by the Fresh Prince
       • [Screenshot callouts: NOISE, Kevin Smith & Will Ferrell; NOISE, Kevin Smith & "Will" in title]
       • Thanks to David Grundy, ALCS
    19. Leading to wrong recommendations
       • Dorothy Wood, the Scottish lady who writes craft works, did not write about Danish immigrants to South Australia
    20. More use cases for ISNI
       • Rights management societies often receive payments then need to research to find the rights holders
         - As digital grows, need is greater and requirement is international
         - Need the system but are bound by data privacy
         - Can use VIAF as interfacing public domain data
         - Shared work load = efficiencies + higher quality
       • Researchers need identifiers
         - Reputation management
         - Grant applications
    21. More use cases for ISNI
       • European Arrow Project
         - Envisages using ISNI with ISTC for registration of digitisation rights information
       • Supply Chain
         - NISO I2 committee recommended the ISNI Assignment system for all institutions in the Digital Supply Chain
       • Music Industry
         - Need for a unique global level identifier shared by record labels and exposure sites
    22. ISNI Scope
       • International
       • Cross domain
         - Including creators and other contributors in all disciplines (authors, editors, translators, illustrators, composers, actors, performers, artists)
         - And all organisations involved in the distribution and supply chain of created works (publishers, intermediaries, retailers, database vendors, libraries)
       • Must have centralised registration
         - Global network of registration agencies
         - Cannot use ISBN model of allocation of ranges
    23. ISNI's relationship with ORCID
       • ISNI-IA is advocating one shared scheme
         - Confusing to have 2+ identifier schemes appearing at the same time for the same identities
         - Dilutes effect of linked data
         - Corrections easier to administer with one scheme
         - Cross domain identification, e.g. writer of scientific articles also book author and song writer
         - ISNI linking identifier in ISO (ISBN, ISTC, ISAN, ISWC ++)
       • Interoperating systems
         - Method to be negotiated; aim for SYNERGY
         - ORCID's focus is on end user input; ISNI's is on registration
    24. The ISNI database
       • Free public access to core metadata of assigned ISNIs:
         - via 4 indexes
         - SRU (RDF coming)
       • Member access (RAG and Data Contributors)
         - Access to full metadata, except confidential data
         - Access to all records, assigned and unassigned, via 20 indexes
         - Ability to add and modify own data
       • Administrator and Quality Team
         - Full access, additional indexes
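
Since the slide lists SRU as a public access route, here is a hedged sketch of what building an SRU searchRetrieve request could look like. The base URL and the index name `name` are placeholders (the slide's own URL did not survive and is not reconstructed here); only the generic SRU parameters `operation`, `version`, `query` and `maximumRecords` come from the SRU protocol itself.

```python
from urllib.parse import urlencode


def build_sru_url(base_url: str, cql_query: str, max_records: int = 10) -> str:
    """Build an SRU searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": str(max_records),
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and index name, for illustration only.
url = build_sru_url("https://sru.example.org/isni", 'name = "Will Smith"')
print(url)
```

A real client would fetch this URL and parse the XML response; which indexes are available depends on the service.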
    25. VIAF (Virtual International Authority File)
       • 12 million+ authority records
       • from 26 national and major research libraries
       • Harvested and managed by OCLC Research; matching algorithms refined over 5-6 years
       Base cross domain file of the ISNI database, building on work already done
    26. Leveraging high confidence data from different domains
       • British Library
         - JISC names (research grant data), UK theses, ZETOC (Possible)
       • CISAC
         - IPI (International Party Identifier) (87% musical composers), 2 million records
       • IPDA (International Performers Database Association)
         - 500,000 performer records
       • IFRRO (International Federation of Reproduction Rights Organisations)
         - Including access© (Canada), ALCS (UK), CEDAR (Netherlands), CEDRO (Spain), Librius (Belgium), Prolitteris (Switzerland), VGWort (Germany) (135 orgs from 74 countries)
       • Proquest/Bowker
         - BIP (books in print), Theses and Scholar Universe, American professional societies
    27. ISNI Assignment
       • An ISNI is assigned where:
         - Metadata from 2 or more independent sources matches with a sufficient level of confidence (match confidence); matching adapted from VIAF matching
         - Or metadata has 3 or more VIAF sources
         - Metadata is complete and unambiguous (special rules for sparseness and for common surnames)
       • All records have a data confidence level
         - Indicating closeness of contact with party behind the identifier
         - Rights management societies have highest confidence
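
The assignment conditions on this slide can be read as a simple predicate. The sketch below is one interpretation, not the ISNI assignment system itself: it treats "complete and unambiguous" as a requirement that applies on top of either corroboration route, and the field names and the 0.9 confidence threshold are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical value; the slide only says "a sufficient level of confidence".
MATCH_CONFIDENCE_THRESHOLD = 0.9


@dataclass
class CandidateIdentity:
    independent_sources_matched: int  # independent sources whose metadata matched
    match_confidence: float           # confidence of that match, 0.0 to 1.0
    viaf_sources: int                 # number of VIAF sources behind the record
    complete_and_unambiguous: bool    # after sparseness / common-surname rules


def should_assign_isni(candidate: CandidateIdentity) -> bool:
    """Return True when the slide's assignment conditions hold (as read here)."""
    corroborated = (
        candidate.independent_sources_matched >= 2
        and candidate.match_confidence >= MATCH_CONFIDENCE_THRESHOLD
    ) or candidate.viaf_sources >= 3
    return corroborated and candidate.complete_and_unambiguous


print(should_assign_isni(CandidateIdentity(2, 0.95, 0, True)))
print(should_assign_isni(CandidateIdentity(1, 0.99, 1, True)))
```

Records that fail the predicate would stay in the database unassigned, which matches the slide on member access to "assigned and unassigned" records.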
    28. ISNI Assignment - Matching
       • Personal names - Primary match fields
         - Name, name identifiers, dates, titles of resources, title identifiers, co-authors
       • Personal names - Secondary match fields
         - Institution affiliations, publishers, nationality, gender, partial titles, experimenting with Dewey classification of titles
       • Organisations - Primary match fields
         - Name, name identifiers, address, organisation type, active dates, associated persons (e.g. band members)
       • Organisations - Secondary match fields
         - Titles of resources, affiliated organisations
       • Series of independent judges with an overall score
    29. ISNI Assignment - Possible matches
       • where the matching score is between the no match threshold and the match threshold
       • where the incoming record has different local identifiers for the same source (could signal either a merge or a split)
       • where the incoming and match record disagree on birth or death date
       • where the incoming and match record share the same local identifier but their similarity is below threshold
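
The four "possible match" conditions can be sketched as a three-way classifier. This is an illustrative reading only: the threshold values and field names are hypothetical, and the choice to let conflicting identifiers or dates demote an otherwise confident match (rather than affect every score) is a simplifying assumption.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the slides do not give numeric values.
NO_MATCH_THRESHOLD = 0.5
MATCH_THRESHOLD = 0.9


@dataclass
class Comparison:
    score: float                        # overall score from the matching judges
    conflicting_local_ids: bool = False # different local ids for the same source
    dates_disagree: bool = False        # birth or death dates differ
    same_local_id: bool = False         # both records share a local identifier


def classify(c: Comparison) -> str:
    """Return 'match', 'possible match', or 'no match'."""
    if c.score >= MATCH_THRESHOLD:
        # A confident score is still held for review if the records conflict.
        if c.conflicting_local_ids or c.dates_disagree:
            return "possible match"
        return "match"
    if c.score > NO_MATCH_THRESHOLD:
        return "possible match"
    # Low similarity, but a shared local identifier suggests the same party.
    if c.same_local_id:
        return "possible match"
    return "no match"


print(classify(Comparison(0.95)))
print(classify(Comparison(0.95, dates_disagree=True)))
print(classify(Comparison(0.2, same_local_id=True)))
```

Possible matches are the cases that would be routed to human review rather than resolved automatically.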
    30. ISNI Database Quality: Quality team
       • Team at Bibliothèque nationale de France and the British Library (Anila Angjeli, Michael Docherty, Nicole Druet, Andrew MacEwan, Richard Moore and Alison Wood)
       • Manual checking of statistical samples, resulting in numerous matching algorithm improvements
       • Review by data source: input to data policies
       • Establishing the percentage of data errors & characteristics
       • Resolution of queries from RAGs and general public
       • Program analysis
       • Creation of anomaly checker, e.g. publishing before 10 years old
       • Diffusion of corrections
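
The anomaly checker mentioned on this slide ("publishing before 10 years old") can be illustrated with a few lines. This is a sketch of that single rule under assumed inputs (a birth year and a list of publication years); the function name is hypothetical and the real checker presumably covers more cases.

```python
from typing import List, Optional

MIN_PUBLISHING_AGE = 10  # from the slide: "publishing before 10 years old"


def publishing_age_anomalies(
    birth_year: Optional[int], publication_years: List[int]
) -> List[int]:
    """Return publication years that fall before the person's tenth birthday year."""
    if birth_year is None:
        return []  # nothing to check without a birth date
    return [
        year for year in publication_years
        if year - birth_year < MIN_PUBLISHING_AGE
    ]


print(publishing_age_anomalies(1950, [1955, 1975, 1990]))
```

A flagged year often signals that two different people with the same name have been merged into one record.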
    31. Centralised Registration / Diffused Collection and Management
       • Network of nodes and expertise
       • RAGs for gathering, completing, assessing the quality of input data, responding to disambiguation responses
       • Verification Agencies and Reference Databases for input to disambiguation, matching, assignment and correction
       • RAGs and Verification Agencies / Reference Databases for diffusing ISNIs and promoting usage
       • Assignment Agency stores URLs in database and sends notifications of changes and corrections
       • All ISNI members responsible for Quality; special role for the Quality Team
    32. From RAGs:
       • ISNI requests
         - Batch load
         - Web and Atom Pub API
       • ISNI enquiries
       • Service issues to Service Desk
       To RAGs:
       • ISNI enquiry responses
       • ISNI Responses
       • Reports and Notifications
       • Service Desk responses
       • Relay Q issues to Quality Team
    33. From General Public:
       • ISNI enquiries (Web and SRU)
       • Data quality issues sent to Quality team
       • At least one RAG will accept public requests on the ISO RAND principle
       Modules:
       • Web enquiry (filtered)
       • Downloadable search box
       • SRU (filtered)
       • Suggestions via
    34. In conclusion
       • ISNI-IA
         - Not for profit, incorporated in the UK, unprecedented cross domain alliance
         - Funded the creation of the database and assignment system
         - Assignment Agency and RAGs also on RAND cost recovery
         - Ongoing costs are modest, with no permanent staff, permitting price per ISNI to be as low as possible
       • Emphasis on registration and data quality, building on existing data
       • Data privacy is respected while core data is open
       • Diffusion and usage of assigned ISNIs is free and is encouraged
    35. ORCID: Open Researcher and Contributor ID. NISO Webinar: Identify This! Identify That! 11 January 2012. Chris Shillum, Member, ORCID Board and VP Product Management, Platform and Content, Elsevier
    36. Problem
       • Estimated 27 million researchers world-wide, including private, government and research institution, not students
       • Many researchers have common names, change names, or are otherwise difficult to uniquely and unambiguously identify
    37. ORCID will work to support the creation of a permanent, clear and unambiguous record of scholarly communication by enabling reliable attribution of authors and contributors
    38. By issuing unique identifiers to all researchers, ORCID will facilitate discovery and evaluation for researchers, institutions, scholarly societies and publishers
       • [Diagram labels: Joins faculty or student body; Joins scholarly society; Applies for grant; Submits manuscript; 46533489]
    39. ORCID will Meet Needs for Multiple Stakeholders in the Scholarly Communication Chain
       • Stakeholders: Researchers; Academic Institutions; Funding Bodies; Publishers
       • Why? Help track output of faculty and students; help perform assessment of grantees; streamline data input; creates author link (to publications, to collaborators, to other forms of communication)
       • [Diagram label: Applies for research grant]
    40. Author - ORCID - Publisher Interaction during manuscript submission
       • [Flow diagram labels: Researcher Registers; Import Existing Profile (e.g. Scopus Author ID); ORCID passed to Manuscript Submission system; Manuscript processed; Publisher Content Published; Metadata, along with ORCID, deposited to CrossRef; ORCID<->DOI pairings submitted to ORCID; Researcher Profile Updated]
    41. ORCID History
       • ORCID Initiative started in late 2009, based on initial proposal by Nature Publishing Group and Thomson Reuters
       • ORCID non-profit organization with a Board of Directors created in August 2010.
       • Now over 275 participants from all sectors in scholarly communication
       • Launch of service planned for the first quarter of 2012.
    42. ORCID Principles. These 10 Principles were adopted unanimously by the ORCID Board in December 2010:
       1. ORCID will work to support the creation of a permanent, clear and unambiguous record of scholarly communication by enabling reliable attribution of authors and contributors.
       2. ORCID will transcend discipline, geographic, national and institutional boundaries.
       3. Participation in ORCID is open to any organization that has an interest in scholarly communications.
       4. Access to ORCID services will be based on transparent and non-discriminatory terms posted on the ORCID website.
       5. Researchers will be able to create, edit, and maintain an ORCID ID and profile free of charge.
    43. ORCID Principles
       6. Researchers will control the defined privacy settings of their own ORCID profile data.
       7. All profile data contributed to ORCID by researchers or claimed by them will be available in standard formats for free download (subject to the researchers' own privacy settings) that is updated once a year and released under the CC0 waiver.
       8. All software developed by ORCID will be publicly released under an Open Source Software license approved by the Open Source Initiative. For the software it adopts, ORCID will prefer Open Source.
       9. ORCID identifiers and profile data (subject to privacy settings) will be made available via a combination of "no charge" and "for a fee" APIs and services. Any fees will be set to ensure the sustainability of ORCID as a not-for-profit, charitable organization focused on the long-term persistence of the ORCID system.
       10. ORCID will be governed by representatives from a broad cross-section of stakeholders, the majority of whom are not-for-profit, and will strive for maximal transparency by publicly posting summaries of all board meetings and annual financial reports.
    44. 275+ Participants
    45. Newest Participants (Sep-Oct 2011)
    46. ORCID is open to any organization with an interest in scholarly communication
    47. ORCID transcends discipline, geographic, national and institutional boundaries
       • 250 participating organizations as of 1 August 2011
    48. Board of Directors
       • Represents a broad cross-section of stakeholders with the majority being not-for-profit
       • Publishers; Academia; Funding Bodies
    49. Timeline 2010
       • [Gantt chart, Feb to Dec 2010. Rows: Build Sandbox; Alpha Prototyping; ORCID Members Demonstration and Alpha Testing; Organization Creation; Wellcome/MIT Survey; Principles/Scope Defined; Alpha Testing; Profile Exchange Research & Development]
    50. Timeline 2011-12
       • [Gantt chart, Q1 2011 to Q4 2012. Rows: Researcher ID Code Licenced from TR; Build Phase 1 - Semantico; Start Registering ORCIDs; Build Phase 2; Sponsorship Drive 1; Sponsorship Drive 2; Staff Hired; Start Collecting Fees; VIVO Technology Research; Mellon Marketing Research; Profile Exchange Research & Development. Amounts shown: $244K; $250 Goal]
    51. Development Progress: Approach
       • Alpha
         - Completed Spring 2010
         - Self-claim oriented
         - Limited light integration with a few participant services
         - Demonstration capabilities transitioning to ORCID source code by end of year
       • Phase 1
         - Development underway based on code donated by Thomson Reuters
         - Development by Semantico under contract with ORCID
         - Development led by Geoff Bilder
         - Will provide core for future production service
         - Will focus on currently active researchers
       • Phase 1.x and 2
         - Development 2012+
         - Will address assertions by wide group of third parties
         - Will extend capabilities for alternate roles and other types of contributions
         - Will provide mechanisms for automatic de-duplication of third party donated records
    52. Phase 1 System Scope
       ORCID will build a central registry of unique identifiers for researchers and scholars with the following scope:
       • ORCID will focus on currently active researchers
       • Data will come from individuals and organizations
       • ORCID will be a hybrid system of self- and organization-asserted identity
       • Data collected will be those needed for disambiguation; extra data for optionally creating full CV-like profiles might be added in the future
       • System will provide basic matching and disambiguation of names
       • ORCID system will, from the start, enable 3rd parties to build value added services using ORCID infrastructure
       • ORCID services will be developed based on the needs of the ORCID community
       Phase 1 System will be based on TR Researcher ID software, but
       • All references to Researcher ID removed
       • Identifier scheme changed
    53. Types of Identity Assertion
       • Self-Asserted Identity
       • Socially-Validated Identity
       • Organisationally-Validated Identity
       Credit: Kaliya Hamlin.
    54. ORCID Phase 2 - Disambiguation Concept
       • Self-asserted + socially-validated + organizationally-asserted identity = more credible assertion
       • [Diagram: Self-Asserted Identity, Socially-Validated Identity and Organizationally-Validated Identity combine into a Disambiguated Identity]
       Geoff Bilder, "Disambiguation without de-duplication: Modeling authority and trust in the ORCID system"
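
The idea that several kinds of assertion about the same claim add up to a more credible one can be sketched with a small data model. Everything here is hypothetical illustration rather than ORCID's design: the `Assertion` record, the three kind labels, and the use of "number of distinct kinds" as a crude credibility signal are all assumptions. The Josiah Carberry example echoes the speaker notes.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class Assertion:
    kind: str         # "self", "social", or "organizational"
    asserted_by: str  # who is making the assertion
    claim: str        # what is being asserted


def supporting_kinds(assertions: Iterable[Assertion], claim: str) -> Set[str]:
    """Return the distinct kinds of assertion that support a claim."""
    return {a.kind for a in assertions if a.claim == claim}


FACULTY = "Josiah Carberry is a faculty member at Brown University"
PAPER = "Josiah Carberry was the author of this paper"

ASSERTIONS = [
    Assertion("self", "Josiah Carberry", FACULTY),
    Assertion("organizational", "Brown University", FACULTY),
    Assertion("organizational", "Nature", PAPER),
]

# More distinct kinds of support is read here as a more credible claim.
print(sorted(supporting_kinds(ASSERTIONS, FACULTY)))
print(sorted(supporting_kinds(ASSERTIONS, PAPER)))
```

Keeping each assertion as its own record, instead of merging sources into one profile, is in the spirit of the "disambiguation without de-duplication" title cited on the slide.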
    55. What Makes ORCID Different?
       • Only not-for-profit contributor identifier initiative dedicated to an open and global service focused on scholarly communication
       • ORCID is backed by a non-profit organization with over 275 participants behind it
       • ORCID is backed by many different stakeholders
       • Publishers are an important ORCID stakeholder but are just one part
       • ORCID is serious about building an open system
       • ORCID is the only researcher identifier that is not limited to discipline, institution or geographic area
       • ORCID is the one to bridge them all by registering the identifiers of all other relevant standalone services (silos big and small)
    56. Come Join Us
       • Register as a participant
       • Follow us on Twitter: @orcid
       • Attend the next Outreach Meeting: 17 May 2012 in Cambridge, MA