The document discusses the Names Project in the UK, which aims to create unique identifiers for UK researchers. It began in 2007 and pre-populated its system with data from the 2008 Research Assessment Exercise. The Names Project takes a hybrid approach, combining automated matching with manual disambiguation, and also allows researchers to contribute information directly. The project seeks to improve data quality and to integrate with other national and international identifier systems such as ISNI. The key challenges are cultural and political rather than technical: gaining national-level agreement and co-ordination on researcher identifier services.
How dinosaurs broke our system: challenges in building national researcher identifier services
1. How dinosaurs broke our system
Challenges in building national researcher identifier services
Amanda Hill
Names Project
JISC Conference, 2010
2. Hoping that…
…Simeon has explained all about the name authority problem
I'd like to talk about some of the work that we've done as part of the Names Project recently…
…and how that fits into today's researcher identification landscape
3. Gross generalisation about past approaches to author identifiers
Libraries | Publishers
Book-level data | Article-level data
Labour intensive: disambiguation first | Automatically generated: disambiguation later
Authors not involved | Authors can edit
Open | Proprietary
4. Current international activity
ISNI | ORCID
Library-instigated | Publisher-instigated
Disambiguation first | Disambiguation later
Authors not involved | Authors can submit/edit
Broad scope | Current researchers
5. Signs of convergence?
Knowledge Exchange meeting on Digital Author Identifiers in March 2012 encouraged alignment of ISNI and ORCID approaches
ISNI has reserved a block of identifiers for use by ORCID
6. Sources of information
Both ORCID and ISNI will use existing pools of information to populate their systems
ISNI: "Leveraging high confidence data from different domains"
"ORCID will link to other name identifier systems"
7. National author ID systems
2011: JISC-funded survey and report on national author/researcher identifier systems around the world
Report published November 2011
http://ie-repository.jisc.ac.uk/567/
8. Maturity of systems (late 2011)
System | In development since | Number of identities
Lattes (Brazil) | 1999 | 1,600,000
Frida/Cristin (Norway) | 2003 | 31,000 researchers at 160 institutions
VIVO | 2003 | 24,400 faculty with profiles; 150,000 total IDs including undisambiguated co-authors
Digital Author Identifier (Netherlands) | 2005 (1980s for National Thesaurus of Author Names) | 40,000 in the NTA; 15,000 researchers with Digital Author IDs
Names Project (UK) | 2007 | 46,000
New Zealand Electronic Text Centre | 2007 | 2,000
Trove People and Organisations/NLA Party Infrastructure (Australia) | 2007 | 900,000 people and organisations
AuthorClaim | 2008 | 200
Researcher Name Resolver (Japan) | 2008 | 190,000
9. Populating identifier systems
Systems were compared on three population methods: records created by cataloguers, records imported from other systems, and records generated by data subjects.
Systems compared: AuthorClaim; Digital Author Identifier (Netherlands); Frida/Cristin (Norway); Lattes (Brazil); Names Project (UK); New Zealand Electronic Text Centre; Researcher Name Resolver (Japan); Trove People and Organisations/NLA Party Infrastructure (Australia); VIVO
10. Good sources of data for some nations
National system | Existing unique identifiers
Japan | Researcher identifiers from national researcher databases
Netherlands | Number from National Thesaurus of Author Names is converted into Digital Author Identifier
Norway | Human resources data: social security numbers
Other national systems assign new identifiers as new identities are established.
11. Features of mature national identifier systems
With more mature systems:
A national organisation generally has oversight: e.g. in Brazil, Norway and the Netherlands
Integration with research funders, reporting agencies and institutional repositories
Individual institutions also have defined roles relating to managing information about their own staff
13. Work to investigate unique IDs for UK researchers
Identified in 2006 as part of the call for proposals for the JISC-funded Repositories and Preservation Programme
Mimas and the British Library proposed a two-year project to:
Investigate requirements for a UK name authority service
Build a pilot system to demonstrate potential
14. The Names Project
The Chang Project
'From the Annals of the Onomastic Society'
Ian Watson (1990)
15. Names (not an acronym…)
Name Authorities Make Everything Simpler
Names: Ambiguous, Meaningful (or Meaningless?), Essential, Symbolic
…nearly everyone has a name-related story
17. Original plan
Use data from the British Library's Zetoc service to create author IDs
Journal article information from 1993 onwards
Last names, initials, paper titles, subject classifications
But…
International in scope
Lack of information on affiliations and first names to help with making matches
Huge dataset -> processing issues
18. Revised plan
Used 2008 Research Assessment Exercise data (as cleaned up by the JISC Merit project) to pre-populate the Names system
Identify unique individuals and assign identifiers
Data quality good, included institutional information: high accuracy, despite only having initials, not full first names
Except for…
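The matching step behind the revised plan can be sketched as follows. This is an illustrative sketch only, not the project's code; the record fields and the identifier format are assumptions.

```python
from collections import defaultdict

def assign_identifiers(records):
    """Cluster author records that agree on surname, initials and
    institution, then give each cluster one identifier.

    Illustrative only: the real Names workflow also applied manual
    checks, and richer data where it was available."""
    clusters = defaultdict(list)
    for rec in records:
        # Initials-only data is ambiguous on its own; combined with
        # the institution it is usually (not always) enough.
        key = (rec["surname"].lower(),
               rec["initials"].replace(".", "").upper(),
               rec["institution"].lower())
        clusters[key].append(rec)
    return {f"names:{i:06d}": clusters[key]
            for i, key in enumerate(sorted(clusters), start=1)}

records = [
    {"surname": "Siveter", "initials": "D.J.", "institution": "University of Leicester"},
    {"surname": "Siveter", "initials": "DJ", "institution": "University of Leicester"},
    {"surname": "Siveter", "initials": "D.J.", "institution": "University of Oxford"},
]
ids = assign_identifiers(records)  # two clusters from three records
```

The "except for…" case is exactly where this breaks down: two distinct researchers who share surname, initials and institution (the "Siveter problem" mentioned in the notes) would be merged incorrectly, which is why human checks follow.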
21. Building on Merit…
Merit data covers around 20% of active UK researchers
Working to enhance records and create new ones with information from other sources:
Institutional repositories
British Library data sets (Zetoc)
Direct input from researchers
24. Quality matters
Automatic matching can only achieve so much
Dependent on data source
British Library team perform manual check of results of matching new data sources
Allows for separation/merging of records
Plan to allow people to update their own information
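One common way to combine automatic matching with manual checks of this kind is threshold triage: confident scores are merged automatically, poor ones kept separate, and borderline ones queued for a human. A minimal sketch, with arbitrary illustrative weights rather than the project's actual scoring:

```python
def match_score(a, b):
    """Crude similarity score between two author records (0.0 to 1.0).
    Weights are illustrative, not taken from the Names project."""
    score = 0.0
    if a["surname"].lower() == b["surname"].lower():
        score += 0.5
    if a["initials"].replace(".", "").upper() == b["initials"].replace(".", "").upper():
        score += 0.2
    if a["institution"].lower() == b["institution"].lower():
        score += 0.3
    return score

def triage(a, b, accept=0.9, reject=0.5):
    """Auto-accept confident matches, auto-reject poor ones, and
    queue everything in between for a manual check."""
    s = match_score(a, b)
    if s >= accept:
        return "merge"
    if s < reject:
        return "separate"
    return "manual review"

verdict = triage(
    {"surname": "Hill", "initials": "A.", "institution": "Mimas"},
    {"surname": "Hill", "initials": "A", "institution": "British Library"},
)
# 0.5 (surname) + 0.2 (initials) = 0.7: borderline, so manual review
```

Tuning the two thresholds shifts work between the automatic matcher and the review queue, which is the trade-off the slide describes.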
25. Ultimate aim
High-quality set of unique identifiers for UK researchers and research institutions
Available to other systems (national and international)
e.g. Names records exported to ISNI in 2011
Possible additional services:
Disambiguation of existing data sets
Identification of external researchers
26. Access to Names
API allows for flexible searching of Names data
EPrints plugin released in 2011: allows repository users to choose from a list of Names identities
…and to create a Names record if none exists
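Client code for such a search API might look like the sketch below; the base URL and parameter names are hypothetical, since the slides do not give the actual query syntax.

```python
from urllib.parse import urlencode

# Hypothetical base URL: the real Names API endpoint and its query
# parameters are not described in these slides.
BASE = "http://names.example.org/api/search"

def search_url(surname, initials=None, institution=None, fmt="json"):
    """Build a query URL for a researcher-identifier lookup."""
    params = {"surname": surname, "format": fmt}
    if initials:
        params["initials"] = initials
    if institution:
        params["institution"] = institution
    return BASE + "?" + urlencode(params)

url = search_url("Hill", initials="A")
# http://names.example.org/api/search?surname=Hill&format=json&initials=A
```

A repository plugin like the EPrints one would present the decoded results as a pick-list, with a fallback action to create a new record when no identity matches.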
29. Next steps…
JISC-convened Researcher ID group – final meeting in September > recommendations
Options Appraisal Report for UK national researcher identifier service > December
Improving data and adding new records
30. Summing up
Names is a hybrid of library/publisher approaches:
Automated matching/disambiguation
Human quality checks
Data immediately available for re-use in other systems
Researchers can supply information
31. An evolving area
Main challenges are cultural and political rather than technical
National author/researcher ID services can be important parts of research infrastructure
Getting agreement and co-ordination at national level is vital
…and, I would say, we are all very jealous of those countries with ready-made data sources like this…
Namey anecdote here? Dicky Moore & Robin Armstrong Viner?
Known in name authority circles as ‘the Siveter problem’
Every time we add a new data set, the quality of the data within the Names pilot improves – recently added information from the University of the West of England – QA process highlighted a previously unnoticed problem with the original Merit data.