How dinosaurs broke our system: challenges in building national researcher identifier services

Presentation on the Names Project given at the Open Repositories conference, 11 July 2012. A video of this talk is available on YouTube.

Published in: Technology, Education
  • Deletable?
  • …and, I would say, are all very jealous of those countries with ready-made data sources like this…
  • Namey anecdote here? Dicky Moore & Robin Armstrong Viner?
  • Known in name authority circles as ‘the Siveter problem’
  • Every time we add a new data set, the quality of the data within the Names pilot improves. For example, when information from the University of the West of England was recently added, the QA process highlighted a previously unnoticed problem with the original Merit data.
  • Transcript of "How dinosaurs broke our system: challenges in building national researcher identifier services"

    1. How dinosaurs broke our system: challenges in building national researcher identifier services. Amanda Hill, Names Project. JISC Conference, 2010
    2. Hoping that…
        • …Simeon has explained all about the name authority problem
        • I’d like to talk about some of the work that we’ve done as part of the Names Project recently…
        • …and how that fits into today’s researcher identification landscape
    3. Gross generalisation about past approaches to author identifiers
        • Libraries: book-level data; labour intensive (disambiguation first); authors not involved; open
        • Publishers: article-level data; automatically generated (disambiguation later); authors can edit; proprietary
    4. Current international activity
        • ISNI: library-instigated; disambiguation first; authors not involved; broad scope
        • ORCID: publisher-instigated; disambiguation later; authors can submit/edit; current researchers
    5. Signs of convergence?
        • Knowledge Exchange meeting on Digital Author Identifiers in March 2012 encouraged alignment of ISNI and ORCID approaches
        • ISNI has reserved a block of identifiers for use by ORCID
    6. Sources of information: both ORCID and ISNI will use existing pools of information to populate their systems
        • ISNI: “Leveraging high confidence data from different domains”
        • “ORCID will link to other name identifier systems”
    7. National author ID systems
        • 2011: JISC-funded survey and report on national author/researcher identifier systems around the world
        • Report published November 2011: http://ie-repository.jisc.ac.uk/567/
    8. Maturity of systems (late 2011)
        System | In development since | Number of identities
        Lattes (Brazil) | 1999 | 1,600,000
        Frida/Cristin (Norway) | 2003 | 31,000 researchers at 160 institutions
        VIVO | 2003 | 24,400 faculty with profiles; 150,000 total IDs including undisambiguated co-authors
        Digital Author Identifier (Netherlands) | 2005 (1980s for National Thesaurus of Author Names) | 40,000 in the NTA; 15,000 researchers with Digital Author IDs
        Names Project (UK) | 2007 | 46,000
        New Zealand Electronic Text Centre | 2007 | 2,000
        Trove People and Organisations/NLA Party Infrastructure (Australia) | 2007 | 900,000 people and organisations
        AuthorClaim | 2008 | 200
        Researcher Name Resolver (Japan) | 2008 | 190,000
    9. Populating identifier systems
        Table columns: System | Records created by cataloguers | Records imported from other systems | Records generated by data subjects (the per-system markings are not preserved in this transcript)
        Systems listed: AuthorClaim; Digital Author Identifier (Netherlands); Frida/Cristin (Norway); Lattes (Brazil); Names Project (UK); New Zealand Electronic Text Centre; Researcher Name Resolver (Japan); Trove People and Organisations/NLA Party Infrastructure (Australia); VIVO
    10. Good sources of data for some nations
        National system | Existing unique identifiers
        Japan | Researcher identifiers from national researcher databases
        Netherlands | Number from National Thesaurus of Author Names is converted into Digital Author Identifier
        Norway | Human resources data: social security numbers
        Other national systems assign new identifiers as new identities are established.
    11. Features of mature national identifier systems. With more mature systems:
        • A national organisation generally has oversight, e.g. in Brazil, Norway, Netherlands
        • Integration with research funders, reporting agencies and institutional repositories
        • Individual institutions also have defined roles relating to managing information about their own staff
    12. SITUATION IN UK
    13. Work to investigate unique IDs for UK researchers
        • Identified in 2006 as part of the call for proposals for the JISC-funded Repositories and Preservation Programme
        • Mimas and the British Library proposed a two-year project to investigate requirements for a UK name authority service and to build a pilot system to demonstrate potential
    14. The Names Project; The Chang Project; “From the Annals of the Onomastic Society”, Ian Watson (1990)
    15. Names (not an acronym…)
        • Name Authorities Make Everything Simpler
        • Names: Ambiguous, Meaningful (or Meaningless?), Essential, Symbolic
        • …nearly everyone has a name-related story
    16. Rhyming couples
    17. Original plan
        • Use data from British Library’s Zetoc service to create author IDs: journal article information from 1993 onwards; last names, initials, paper titles, subject classifications
        • But… international in scope; lack of information on affiliations and first names to help with making matches; huge dataset, leading to processing issues
    18. Revised plan
        • Used 2008 Research Assessment Exercise data (as cleaned up by JISC Merit project) to pre-populate the Names system: identify unique individuals and assign identifiers
        • Data quality good, included institutional information: high accuracy, despite only having initials, not full first names
        • Except for…
    20. Building on Merit…
        • Merit data covers around 20% of active UK researchers
        • Working to enhance records and create new ones with information from other sources: institutional repositories; British Library data sets (Zetoc); direct input from researchers
    21. Submission form
    22. http://separatedbyacommonlanguage.blogspot.com/2009/08/initials-and-names.html
    23. Quality matters
        • Automatic matching can only achieve so much: dependent on data source
        • British Library team perform manual check of results of matching new data sources: allows for separation/merging of records
        • Plan to allow people to update their own information
    24. Ultimate aim
        • High-quality set of unique identifiers for UK researchers and research institutions
        • Available to other systems (national and international), e.g. Names records exported to ISNI in 2011
        • Possible additional services: disambiguation of existing data sets; identification of external researchers
    25. Access to Names
        • API allows for flexible searching of Names data
        • EPrints plugin released in 2011: allows repository users to choose from a list of Names identities …and to create a Names record if none exists
    28. Next steps…
        • JISC-convened Researcher ID group: final meeting in September, followed by recommendations
        • Options Appraisal Report for UK national researcher identifier service: December
        • Improving data and adding new records
    29. Summing up: Names is a hybrid of library/publisher approaches
        • Automated matching/disambiguation
        • Human quality checks
        • Data immediately available for re-use in other systems
        • Researchers can supply information
    30. An evolving area
        • Main challenges are cultural and political rather than technical
        • National author/researcher ID services can be important parts of research infrastructure
        • Getting agreement and co-ordination at national level is vital
    31. Project updates
        • Names: http://names.mimas.ac.uk
        • Blog: http://namesproject.wordpress.com
        • Twitter: @NamesProject
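
The slides describe a hybrid workflow: automated matching on surname, initials and institution, with a manual check for cases the matcher cannot settle. The sketch below is one possible way to illustrate that idea and is not the Names Project's actual algorithm. The `Record` and `Registry` classes, the `local-NNNN` identifier format and the prefix rule for comparing initials are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from itertools import count
import re


@dataclass(frozen=True)
class Record:
    """One incoming name record (hypothetical shape, for illustration only)."""
    surname: str
    initials: str
    institution: str


def normalise(text: str) -> str:
    """Lower-case and drop everything except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def initials_compatible(a: str, b: str) -> bool:
    """True when one set of initials is a prefix of the other, e.g. 'J.' and 'JA'."""
    a, b = normalise(a), normalise(b)
    return bool(a) and bool(b) and (a.startswith(b) or b.startswith(a))


@dataclass
class Registry:
    identities: dict = field(default_factory=dict)    # identifier -> list of Records
    review_queue: list = field(default_factory=list)  # (record, candidate identifiers)
    _next_id: count = field(default_factory=lambda: count(1))

    def add(self, record: Record):
        """Return the identifier assigned, or None if the record needs manual review."""
        # Identities with the same surname and compatible initials.
        same_name = [
            ident for ident, recs in self.identities.items()
            if any(normalise(r.surname) == normalise(record.surname)
                   and initials_compatible(r.initials, record.initials)
                   for r in recs)
        ]
        # Of those, the ones already seen at the same institution.
        same_place = [
            ident for ident in same_name
            if any(normalise(r.institution) == normalise(record.institution)
                   for r in self.identities[ident])
        ]
        if len(same_place) == 1:
            # Unambiguous automatic match: attach the record to that identity.
            self.identities[same_place[0]].append(record)
            return same_place[0]
        if same_name:
            # Name matches but institution does not (or several candidates):
            # leave the merge/separate decision to a human reviewer.
            self.review_queue.append((record, same_name))
            return None
        # No candidate at all: establish a new identity.
        ident = f"local-{next(self._next_id):04d}"
        self.identities[ident] = [record]
        return ident
```

In this sketch two people who share a surname, initials and institution would be merged automatically, which is exactly the kind of case where initials-only data falls short and a manual quality check is needed.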

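
The EPrints plugin on slide 25 lets a repository user choose from a list of existing identities, or create a record if none exists. A minimal sketch of that choose-or-create flow follows. `NameStore` is an in-memory stand-in invented for this example and does not reflect the real Names API; the `example-N` identifiers and the `pick` callback (standing in for the user's selection) are likewise assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class NameStore:
    """In-memory stand-in for an identifier service (not the real Names API)."""
    records: dict = field(default_factory=dict)  # identifier -> display name

    def search(self, query: str) -> list:
        """Return sorted (identifier, name) pairs whose name contains the query."""
        q = query.casefold()
        return sorted((i, n) for i, n in self.records.items() if q in n.casefold())

    def create(self, name: str) -> str:
        """Add a new record and return its (hypothetical) identifier."""
        ident = f"example-{len(self.records) + 1}"
        self.records[ident] = name
        return ident


def choose_or_create(store: NameStore, name: str, pick) -> str:
    """Offer matching identities to `pick`; create a record if nothing is chosen.

    `pick` receives the candidate list and returns an identifier or None.
    """
    candidates = store.search(name)
    chosen = pick(candidates) if candidates else None
    return chosen if chosen is not None else store.create(name)
```

For instance, calling `choose_or_create(store, "A. Example", lambda c: c[0][0])` on an empty store creates a record, and a second call with the same name returns the existing identifier instead of creating a duplicate.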