The British Library published a Collection Metadata Strategy in 2015 to address challenges around legacy metadata and increase the value of its metadata assets. The strategy aims to make the Library's metadata comprehensive, coherent, authoritative and sustainable by 2020. This will be achieved by driving efficiencies, improving asset management, and increasing open access to metadata. Since 2015 the Library has created new processes to enhance e-book metadata, assessed metadata assets, added data to its linked open data platform, and increased the number of open metadata sets available.
Seldom do aspiring librarians predict that they will be the ones managing the intricacies of electronic resources. Yet many are charged with complicated and unique tasks, like aligning resources in vendor knowledgebases, which can be a confusing and frustrating process. This session provides a brief overview of the KBART standard, its place in electronic resource workflows, current trends, and how librarians can avoid some common knowledgebase issues.
This presentation was given by Noah Levin, KBART Standing Committee Co-Chair, at the NISO Annual Meeting and Standards Update on June 25. The event was held as a part of ALA Annual 2021.
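KBART title lists are plain tab-separated files with standardized column headers, so they can be read with ordinary CSV tooling. The sketch below parses a minimal KBART-style snippet; the column names used are real KBART fields, but the sample titles, identifiers, and URLs are invented for illustration.

```python
import csv
import io

# Invented sample rows in KBART's tab-separated layout; only a few of the
# standard's columns are shown here.
SAMPLE_KBART = (
    "publication_title\tprint_identifier\tonline_identifier\ttitle_url\n"
    "Journal of Examples\t1234-5678\t2345-6789\thttps://example.org/joe\n"
    "Annals of Placeholders\t0000-0001\t0000-0002\thttps://example.org/aop\n"
)

def load_kbart(text):
    """Parse KBART text into a list of dicts keyed by column header."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))

titles = load_kbart(SAMPLE_KBART)
print(len(titles))                     # 2
print(titles[0]["publication_title"])  # Journal of Examples
```

Because the format is just headed TSV, the same loader works for any provider's KBART file regardless of which optional columns it includes.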
This document summarizes information from a NISO meeting about the Transfer Code of Practice. It discusses:
- Version 5.0 of the code is in progress and will be released in March 2022 with updates around open access.
- The Keepers Registry for journal transfers is now hosted by the International ISSN Centre.
- An open teleconference in February featured discussions of frequently asked questions about the code.
- The code provides guidance for transferring and receiving publishers on communicating and transferring digital assets and rights between them in the case of a journal transfer.
20141030 LinDA Workshop echallenges2014 - LinDA project overview - LinDa_FP7
The LinDA project aims to provide tools for small and medium enterprises to access and analyze public sector information. The project will develop a transformation engine to convert data into semantic formats, a repository for linked data vocabularies, a linked data API, visualization tools, and analytics applications. These tools will help SMEs integrate public and private data sources to discover new patterns and develop innovative business models. The goal is to motivate more publication and use of open government data using semantic web standards.
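The conversion of tabular data into semantic formats that the abstract describes can be illustrated with a stdlib-only sketch: serializing one record as RDF N-Triples. The URIs and vocabulary here are placeholders, not anything defined by LinDA.

```python
# A minimal sketch of lifting a tabular public-sector record into RDF
# N-Triples. All URIs and field names below are invented for illustration.

def to_ntriples(record, base="http://example.org/org/",
                vocab="http://example.org/vocab/"):
    """Emit one N-Triples line per non-id field of the record."""
    subject = f"<{base}{record['id']}>"
    lines = []
    for key, value in record.items():
        if key == "id":
            continue
        lines.append(f'{subject} <{vocab}{key}> "{value}" .')
    return "\n".join(lines)

record = {"id": "org-42", "name": "City Transit Authority", "employees": 350}
print(to_ntriples(record))
```

A real transformation engine would also handle datatypes, language tags, and vocabulary mappings, but the core move (row to subject, column to predicate, cell to object) is the same.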
This session will comprise a talk with a panel of speakers looking at KBART seven years on, from the publication of the first set of recommendations to today. The panel will discuss changes in the e-resources metadata landscape, the benefits of KBART, and the challenges of its implementation. Poor metadata in the electronic resources supply chain remains a problem today. Using practical examples, the panel will explain how metadata creation, consumption, and usage are marked by the constant need to balance available resources (technical and human) against end-user discoverability needs. The KBART Standing Committee sees the implementation of the KBART recommendations as a community effort involving a range of stakeholders: content providers, knowledge bases, link resolvers, and librarians.
Realizing an Asset's Value: It's Only as Good as its Metadata - AvenueCX
Without the proper information associated with a piece of content, it becomes inaccessible for future use and its future ROI is lost. Ensuring that assets and content are tagged appropriately and accurately is an essential function of stewards of an organization’s assets. This process starts with assessing all the pieces of relevant information about your content and developing a schema to ensure this tagging is done properly. This presentation shows you why and how.
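A tagging schema of the kind described above can be enforced with a simple validation pass. In this sketch the required fields and the controlled vocabulary are invented for illustration; a real schema would come from the organization's own metadata assessment.

```python
# Hypothetical schema: required fields plus one controlled vocabulary.
REQUIRED = {"title", "creator", "date", "rights"}
CONTROLLED = {"rights": {"public-domain", "cc-by", "all-rights-reserved"}}

def validate_asset(metadata):
    """Return a list of problems; an empty list means the asset is well tagged."""
    problems = [f"missing field: {f}" for f in sorted(REQUIRED - metadata.keys())]
    for field, allowed in CONTROLLED.items():
        if field in metadata and metadata[field] not in allowed:
            problems.append(f"invalid value for {field}: {metadata[field]!r}")
    return problems

asset = {"title": "Q3 product photo", "creator": "jdoe", "rights": "cc-by"}
print(validate_asset(asset))  # ['missing field: date']
```

Running such a check at ingest time is what keeps assets findable later: an asset that fails validation is exactly the one whose future ROI is at risk.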
This presentation was given by Tom Beyer of The Sheridan Group and Athena Hoeppner of The University of Central Florida, at the NISO Annual Meeting and Standards Update on June 25. The event was held as a part of ALA Annual 2021.
CILIP Conference - x metadata evolution the final mile - Richard Wallis - CILIP
Bibliographic metadata forms have evolved over centuries, the last 50 years in machine-readable formats. The library community appears to be evolving from records towards describing real-world entities using an agreed form of linked data. Is that step far enough to satisfy the ever-present need to aid discovery in the approaching third decade of the twenty-first century? Or do we need to move into the landscape of globally understood structured data and knowledge graphs: the environment of answer engines, mobile/local search, and voice assistants?
#cilipconf19
The document summarizes an update on the NISO Open Discovery Initiative standards. It provides an overview of the ODI, which defines recommendations for data exchange between libraries, content providers, and discovery service vendors. The ODI aims to help libraries assess content provider participation in discovery services and ensure fair and unbiased indexing. It also outlines the roles and responsibilities of each party to ensure transparency and conformance with ODI practices. Recent updates to the ODI recommended practice in 2020 focused on metadata elements, fair linking, open access indicators, and statistical reporting.
The CTDA saw significant growth in 2016, with digital assets increasing by over 45% to 412,547 and harvested records growing by over 43% to 49,923. New participants were added and functionality was expanded. Governance committees met regularly to discuss initiatives and projects. Education and training sessions were provided, including a user conference and workshops. The sites and systems performed reliably, with over 98% uptime. Feedback from surveys was generally positive and highlighted areas for further improvement and reporting.
CILIP Conference - Diffusion of ISNIs into book supply chain metadata - Andr... - CILIP
The presentation by Tim Devenport and Andrew MacEwan gives an introduction to the ISNI system and member network and describes how ISNI is linking library authority files with publisher supply chain metadata across multiple content industries. A case study shows how the use of ISNI in the British Library's metadata opens up new opportunities for collaboration with the book publishing industry.
#cilipconf19
This document summarizes content workflow solutions from RightsDirect that help rightsholders and content users. It describes RightsDirect's licensing, document delivery, and training services for over 35,000 companies in 180 countries. The main solution highlighted is RightFind, a content workflow platform that allows users to find, get, purchase, share, and manage content. It provides tools for document delivery, licensing, content analytics, tagging and collaboration.
This document summarizes a workshop on data integration approaches and the PSICQUIC framework. It discusses data warehousing versus federated databases, describes web services protocols like SOAP and REST, and outlines the PSICQUIC registry and services for programmatically querying molecular interaction data from multiple sources using a common standard. Workflows for analyzing interaction data using myExperiment and Taverna are also mentioned.
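PSICQUIC's common standard means every compliant service accepts the same query path, so a client only needs to build one URL shape. The base URL below is a placeholder (real endpoints are listed in the PSICQUIC registry), and the path convention shown is my understanding of the service pattern, not taken from this workshop's materials.

```python
from urllib.parse import quote

# Placeholder base URL; substitute a real service endpoint from the
# PSICQUIC registry. The path shape follows the common REST convention
# of appending a MIQL query to a shared search path.
def psicquic_url(base, miql_query):
    """Build a search URL for a MIQL query against a PSICQUIC-style service."""
    return f"{base.rstrip('/')}/webservices/current/search/query/{quote(miql_query)}"

url = psicquic_url("https://example.org/psicquic",
                   'species:human AND detmethod:"two hybrid"')
print(url)
```

Because the query syntax (MIQL) and the path are shared, the same function can fan one query out across every registered service, which is the point of the registry.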
Building the Global Open Knowledgebase (ER&L 2013) - GOKb Project
The document discusses the Global Open Knowledgebase (GOKb) project, which aims to build an open, community-maintained knowledgebase of library data elements. It describes the three phases of the project, including developing a data model, using OpenRefine as a rules engine for data ingest, and building a prototype web application. It demonstrates how OpenRefine is used to clean and normalize library data as the first step for ingesting it into GOKb. The document outlines future plans to enhance OpenRefine and the GOKb web application, implement co-referencing services and APIs, and develop partnerships within the library community.
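The cleaning and normalization step the document attributes to OpenRefine can be illustrated in plain Python. The sketch below mimics the idea behind OpenRefine's "fingerprint" clustering (it is not OpenRefine's exact algorithm): variant spellings collapse to one key, so they can be reconciled before ingest.

```python
import re
from collections import defaultdict

def fingerprint(value):
    """Normalise a messy string to a clustering key: lowercase, strip
    punctuation, then sort the unique tokens."""
    tokens = re.sub(r"[^\w\s]", " ", value.lower()).split()
    return " ".join(sorted(set(tokens)))

# Invented publisher-name variants of the kind found in library data.
publishers = ["Springer Verlag", "springer-verlag", "Verlag, Springer"]
clusters = defaultdict(list)
for name in publishers:
    clusters[fingerprint(name)].append(name)

print(dict(clusters))  # all three variants share one fingerprint key
```

In a GOKb-style workflow, each cluster would then be reviewed and mapped to a single canonical entity before the data enters the knowledgebase.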
The document discusses the Integrated Archives and Manuscripts System (IAMS) project at the British Library. IAMS aims to deliver a unified catalog and discovery system for the Library's archival collections. So far, IAMS has migrated data from 40 systems into a single catalog with over 1 million records expected by 2011. A public interface called SOCAM provides search and discovery tools for archive collections. Future work includes enabling ordering and access to digital materials as well as developing standards-based data exchange.
Libraries are constantly under pressure to reduce content spending. In order to meet this challenge while continuing to serve their user’s real needs, the library needs to develop a deep understanding of content use across the organisation. This means augmenting plain usage data with details of organisational structure, and using that knowledge to choose optimal license and access strategies. By taking analytics a step deeper and leveraging a broad portfolio of access options, libraries can continue to support the research process while managing costs.
We will discuss the challenges of aggregating usage data, demonstrate a selection of analytical tools, and how the analysis can inform the buy decision as well as the justification (ROI) for it. Attendees will gain an overview of analytical techniques, learn about the trade-offs of build vs. buy strategies, and come away with an understanding of key success factors for implementing an analytics program.
Qwam Content Intelligence provides web monitoring and content analytics solutions through its AsknRead platform, which allows clients to monitor over 200,000 web sources in real time, select and share relevant content, and receive customized alerts and reports. Their solutions help users in fields like media monitoring, competitive intelligence, and marketing by extracting valuable information from online sources.
This document provides an overview and introduction to GOKb, a freely available community-managed knowledge base for electronic resource information. The goals are to introduce GOKb, discuss what it is and isn't, provide an overview of its innovative aspects, and how people can interact with and benefit from it. GOKb aims to solve problems with current inefficient and duplicative models by creating an open, standardized source for eresource management information that can be maintained and used by the broader library community.
GRAPH-TA 2013 - RDF and Graph benchmarking - Jose Lluis Larriba Pey - Ioan Toma
The document discusses an agenda for a meeting on benchmarking RDF and graph databases. It provides an overview of the LDBC benchmarking project including its objectives to create standardized benchmarks, spur industry cooperation, and push technological improvements. It outlines the working groups and task forces that will focus on specific types of benchmarks. It also discusses common issues to be addressed like suitable use cases, benchmark methodologies, and benchmark workloads for querying, updating, and integrating RDF and graph data. Open questions are raised about benchmark realism, benchmark rules, and technological convergence of RDF and graph databases.
The Global Open Knowledgebase (GOKb): open, linked data supporting library el... - GOKb Project
This document provides an overview of the Global Open Knowledgebase (GOKb) project. GOKb aims to create a freely available, community-managed data repository containing publication information about electronic resources across the supply chain from publishers to suppliers to libraries. It will support key functions of the content lifecycle like selecting, licensing, managing and assessing resources. The goal is to solve problems around data duplication, quality and untangling across the industry. Current partners include libraries, vendors and publishers. Future plans include expanding coverage, adding linked data, and growing partnerships. GOKb could also enhance open access by providing standardized reference data and linking between subscribed and open collections.
The document provides information about Search Technologies, a leading independent IT services firm specializing in enterprise search and big data search solutions. It details their expertise in Microsoft Search, Google Search Appliance, and open source search technologies. It also describes their content processing framework Aspire, connectors for integrating various content sources, and query processing language QPL.
Open Data management is still not trivial nor sustainable - COMSODE results are here to bring automation to publication and management of Open Data in public institutions and companies. Presentation includes Open Data Ready standard proposal, three use cases and invitation for Horizon 2020 projects 2016.
Meeting the aims of Plan M and streamlining metadata workflows with the BDS A... - CILIP MDG
This document introduces Bibliographic Data Services (BDS) and their new Academic Library Licence (ALL) which aims to streamline metadata workflows for academic libraries in accordance with Plan M. The ALL provides a subscription for the creation, curation, supply and sharing of metadata for library holdings and acquisitions. BDS will create high-quality MARC records adhering to standards like RDA, using their process of upgrading records from trade to confirmed status. The ALL supports automated workflows for ordering and receiving records to match library resources. It is presented as helping libraries reduce costs and carbon footprints through centralized cataloging activities.
OA Network: Heading for Joint Standards and Enhancing Cooperation: Value‐Adde... - Stefan Buddenbohm
OA-Network collaborates with other associated German Open Access projects and pursues the overarching aim of increasing the visibility and ease of use of German research output. To this end, a technical infrastructure is being established to offer value-added services based on a shared information space across all participating repositories. In addition, OA-Network promotes the DINI certificate for Open Access repositories (standardization) and regular communication exchange across the German repository landscape.
RDM Roadmap to the Future, or: Lords and Ladies of the Data - Robin Rice
Story of the new 2017-2020 University of Edinburgh RDM Roadmap, with a Tolkienesque theme for IASSIST-CARTO 2018 in Montreal: "Once upon a data point: sustaining our data storytellers".
The Jisc Research Data Discovery Service Project aims to build a UK research data discovery service that enables discovery of UK research data and meets requirements. Phase 2 will build on previous pilot work to lay foundations for the future delivery of the service, including developing use cases, agreeing metadata standards, and creating a business case. The project team is working with participating universities and data centers to ingest metadata and gather feedback to develop an effective solution.
Repositories unleashing data and Jisc projects - Jisc RDM
Jisc supports the UK research process through developing shared infrastructure and standards. Two relevant projects are the UK Research Data Discovery Service and Research Data Metrics. The UK Research Data Discovery Service aims to make research data more discoverable by evaluating metadata aggregation models and developing a sustainable discovery service. It is currently in Phase 2 testing local and cross-institutional search. The Research Data Metrics project aims to assess data usage and develop a proof-of-concept tool to measure the effectiveness of research data management systems and inform the progression to a metrics service.
How to make your data count webinar, 26 Nov 2018 - ARDC
This document outlines the Make Data Count (MDC) initiative to standardize and promote the tracking of research data usage metrics. MDC has developed a Code of Practice for data usage logs, built an open hub to aggregate standardized usage data, and implemented tracking and display of usage metrics at their own repositories. They encourage other repositories to follow five simple steps to Make Their Data Count: 1) Read the Code of Practice, 2) Process usage logs, 3) Send logs to the hub, 4) Pull usage metrics from the hub, and 5) Display metrics. Future work includes outreach, iteration on implementations, and expanding metrics beyond DOIs.
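Step 2 above ("process usage logs") amounts to aggregating raw repository hits into per-dataset counts. The sketch below uses invented log lines and dataset identifiers; the real Code of Practice additionally requires double-click filtering and robot exclusion, which are omitted here.

```python
from collections import Counter

# Invented access-log lines: timestamp, method, path, status.
log_lines = [
    "2018-11-26T10:00:01 GET /dataset/10.5072/abc 200",
    "2018-11-26T10:00:05 GET /dataset/10.5072/xyz 200",
    "2018-11-26T10:01:12 GET /dataset/10.5072/abc 200",
]

def count_usage(lines):
    """Aggregate raw hits into per-dataset counts (no filtering applied)."""
    counts = Counter()
    for line in lines:
        path = line.split()[2]                    # e.g. /dataset/10.5072/abc
        dataset_id = path.split("/dataset/")[1]   # e.g. 10.5072/abc
        counts[dataset_id] += 1
    return counts

print(count_usage(log_lines))
```

The resulting counts are what a repository would then send to the hub (step 3) in the standardized report format, rather than shipping raw logs.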
ORCID - UK PIDs for Open Access - progress update - Jisc
This document provides an update on progress with the UK PIDs for Open Access initiative. It discusses establishing a multi-consortium approach and governance model to promote unique identifiers like ORCID. A task force identified priorities like leadership support for mandates and outreach. A community survey highlighted barriers around metadata and integration costs. The next phases involve mapping optimal PID workflows and conducting a cost-benefit analysis to quantify potential benefits from metadata reuse and aggregation. A Research Identifier National Coordinating Committee is being established for community oversight and governance of these activities.
"Benchmarking of distributed linked data streaming systems", as presented at the Stream Reasoning Workshop 2018 (January 16-17, 2018), held by the Department of Informatics DDIS, University of Zurich, in Zurich, Switzerland.
This work was supported by grants from the EU H2020 Framework Programme provided for the project HOBBIT (GA no. 688227).
Data Library Services In The Data Stewardship Lifecycle - Chuck Humphrey
The document discusses lessons learned from the development of data library services in Canada over the past 20 years. It summarizes that collections were a driving force behind introducing data services, with institutions collaborating to help establish data as a library resource. Training has also been important for continued participation in data initiatives and allowing differences across institutions. However, national priorities and a collective forum are still needed to fully address concerns around data access and preservation.
British Library Linked Open Data Presentation for ALA June 2014 / nw13
The document summarizes the British Library's experience in providing linked open data. It describes why the library offers linked data, what data is offered, and lessons learned. Key points include: the library offers metadata in various formats including RDF/XML and CSV to promote innovation, migration to standards, and collaboration; their linked data program has over 1,000 user organizations and 2 million transactions monthly; and lessons learned include understanding diverse user needs, continually improving data quality, and maintaining funding through measurable impact.
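The kind of linked open data described above can be illustrated by serialising a toy bibliographic record as N-Triples. The resource URI below is invented for illustration and is not a real British Library identifier; only the Dublin Core and Lexvo URIs are real vocabularies.

```python
# Minimal sketch of RDF triples of the kind a linked open data service
# might expose, serialised as N-Triples using only the standard library.
def ntriple(s, p, o, literal=False):
    """Render one N-Triples statement; 'o' is a URI unless literal=True."""
    obj = f'"{o}"' if literal else f"<{o}>"
    return f"<{s}> <{p}> {obj} ."

DCT = "http://purl.org/dc/terms/"
book = "http://example.org/resource/123"  # illustrative, not a BL URI
triples = [
    ntriple(book, DCT + "title", "A Sample Title", literal=True),
    ntriple(book, DCT + "language", "http://lexvo.org/id/iso639-3/eng"),
]
print("\n".join(triples))
```

Note how a URI object (the language) links out to another dataset, which is what makes the data "linked" rather than merely open.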
Lorraine Beard: RDM at the University of Manchester / Jisc
The University of Manchester has established a Research Data Management service and policy to support researchers in managing their research data. The RDM service was launched in 2011 and is a collaboration between the University Library and IT Services. It aims to provide guidance, tools, and infrastructure to help researchers comply with funder data sharing requirements and best practices for data management, storage, and preservation. Key challenges for the future include developing metadata standards, tools for data sharing and publishing, coordinating expertise across departments, and adapting to a changing research environment and funder landscape.
FAIR Data Interim Report and Action Plan / Sarah Jones
The document is an interim report from the European Commission Expert Group on FAIR Data. It provides recommendations and an action plan to make data FAIR (Findable, Accessible, Interoperable, Re-usable). The report defines key concepts of FAIR, recommends developing standards and components like identifiers, metadata and repositories to create a sustainable FAIR data ecosystem. It also recommends ensuring FAIR data and services, embedding a culture of FAIR practices, and developing metrics to assess progress. The action plan outlines next steps like consulting stakeholders on the recommendations and revising the report.
Research at risk: developing a shared research data management service for UK... / Jisc RDM
Rachel Bruce presented on Jisc's plans to develop a shared research data management service for UK universities. The service aims to help universities meet research funder requirements for data management and sharing in a cost effective way. It will provide services such as storage, metadata, and tools to help with data discovery and reuse. Jisc conducted surveys that found universities wanted services for preservation, automation, integration, and reducing their IT burden. The shared service is being developed through 2017 based on requirements identified.
The JISC Continuing Access and Digital Preservation Strategy 2002-5, presented to the 2004 JISC-CNI conference in Brighton, UK, is the fifth of 12 presentations I have selected to mark 20 years in Digital Preservation.
This presentation from 2004 is important largely for the legacy of the Strategy that established bodies such as the Digital Preservation Coalition and the Digital Curation Centre, which still have a major influence today.
The presentation sets out the context and rationale for the Strategy including the predicted growth of electronic publications, scientific data, and data curation. The implications of that growth were seen as:
• Core funding for institutions would not grow in line with information growth;
• A need for more automation and tools;
• A need for new shared services and information infrastructure;
• A significant need for R&D and investment to prepare for this.
Therefore the objectives of the Strategy were:
• As an advocacy document to secure additional funding of £6m over 3 years (2002-5) for new programmes in electronic records management and digital preservation;
• Justify the accompanying implementation plan;
• Provide a longer-term framework and rationale for activity extending beyond 2005.
Similar to Unlocking the value: a metadata strategy for the British Library / Alan Danskin
UK Committee on RDA, RDA Day: New Tools for the Future of Cataloguing - Jenny... / CILIP MDG
“The RDA Day is programmed by the UK Committee on RDA. Using activities and games throughout informative presentations, the RDA Day will inform and engage metadata practitioners and managers on a content standard which integrates well with the metadata needs of the 21st century”
Paper presented on the UKCoR RDA Day during the Metadata & Discovery Group Conference (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
Challenges to implementation - Jenny Wright / CILIP MDG
“The RDA Day is programmed by the UK Committee on RDA. Using activities and games throughout informative presentations, the RDA Day will inform and engage metadata practitioners and managers on a content standard which integrates well with the metadata needs of the 21st century”
Paper presented on the UKCoR RDA Day during the Metadata & Discovery Group Conference (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
Application Profiles in RDA - Jenny Wright / CILIP MDG
“The RDA Day is programmed by the UK Committee on RDA. Using activities and games throughout informative presentations, the RDA Day will inform and engage metadata practitioners and managers on a content standard which integrates well with the metadata needs of the 21st century”
Paper presented on the UKCoR RDA Day during the Metadata & Discovery Group Conference (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
The Official RDA Toolkit - Opportunities for Efficiency - Thurstan Young / CILIP MDG
“The RDA Day is programmed by the UK Committee on RDA. Using activities and games throughout informative presentations, the RDA Day will inform and engage metadata practitioners and managers on a content standard which integrates well with the metadata needs of the 21st century”
Paper presented on the UKCoR RDA Day during the Metadata & Discovery Group Conference (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
The Official RDA Toolkit - Opportunities for Enrichment - Thurstan Young / CILIP MDG
The document discusses opportunities for enriching metadata in the Official RDA Toolkit. It provides background on extension plans, representative expressions, and data provenance. An example is given of recording an extension plan and representative expression for a multi-volume work. The extension plan vocabulary and representative expression elements are shown as ways to enrich RDA descriptions through structured, encoded values.
“The RDA Day is programmed by the UK Committee on RDA. Using activities and games throughout informative presentations, the RDA Day will inform and engage metadata practitioners and managers on a content standard which integrates well with the metadata needs of the 21st century”
Presented on the UKCoR RDA Day during the Metadata & Discovery Group Conference (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
RDA methods, scenarios, tools - Gordon Dunsire / CILIP MDG
“The RDA Day is programmed by the UK Committee on RDA. Using activities and games throughout informative presentations, the RDA Day will inform and engage metadata practitioners and managers on a content standard which integrates well with the metadata needs of the 21st century”
Paper presented on the UKCoR RDA Day during the Metadata & Discovery Group Conference (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
Poster: What’s in a name? Re-Discovering cataloguing and index through metada... / CILIP MDG
In 2019 CILIP’s Cataloguing and Indexing Group changed its name to the Metadata and Discovery Group. This poster will showcase the transition of the look and feel of the group’s logo and the process of designing a new one.
Poster presented at the CILIP Metadata and Discovery Group (MDG) Conference & UKCoR RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham).
Poster: Revamping our in-house cataloguing training / Victoria Parkinson (Kin... / CILIP MDG
With hybrid working and a new LMS, we are revamping our in-house cataloguing training. We are learning from our teaching librarians and using the tools we have, such as Moodle, to create cataloguing training that allows anyone with an interest to learn the basics and makes the best use of face-to-face time for putting those skills into practice. Over the past eight years we’ve adapted and updated our in-house training, and I’ll also talk about how we decide what to teach colleagues, and how we try to make the best use of staff time to keep skills up when cataloguing is one of many competing priorities and shared across several teams. Between staff turnover, COVID lockdowns and service changes, we are starting almost from scratch in building a pool of staff who can catalogue the material our suppliers can’t provide records for, which is an excellent time to take stock of what our cataloguing needs are, and advocate for the importance of creating and upgrading good quality records and why we need to build these skills in-house.
Poster presented at the Metadata & Discovery Group Conference & RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
Poster: FAST : can it lighten the load, and what is the impact? / Jenny Wrigh... / CILIP MDG
This poster presents the Faceted Application of Subject Terminology, giving an overview of the scheme, its advantages and potential issues, and its practical implementation. It will demonstrate that FAST is an important development for those interested in Linked Data, and the ways in which it is a useful tool for discovery in any system.
Poster presented at the CILIP Metadata and Discovery Group (MDG) Conference & UKCoR RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham).
Poster: The West Midlands Evidence Repository (WMER) : a regional collaborati... / CILIP MDG
The West Midlands Evidence Repository (WMER) was born from a pre-pandemic recognition by managers of Knowledge and Library Services (KLSs) of 8 NHS Trusts in the West Midlands region of the need for a repository. This was to replace existing provision, or recognition of national priorities or local needs to record, collect, and share research, as well as potential for sharing patient information leaflets or guidelines. Some managers and services had previous experience of repositories, as well as being part of a national pilot. WMER, however, represented a new start for all to work in collaboration to establish a new service. The consortium would enable sharing of both costs and experience.
Initially, different repository suppliers were investigated by the KLS that had had a long-established repository, taking on board the experience of the group from the national pilot. The Atmire Open Repository platform was chosen as it met the consortium’s needs and had a proven track record of other collaborative repositories in the NHS. Financing was taken on by one Trust and the on-boarding was led in partnership between that Trust and the Trust that had undertaken the initial investigation.
With the initial on-boarding completed and the test server set up, the group took a step back to ensure they worked together as a collaborative going forward. Collaborative work between the KLSs was facilitated by the formal creation of two groups, a Managers Group for overall approval and financial decision making and an Operational Group handling the setup and administration of the repository for the consortium. The Operational Group is led by the service with most experience of managing repositories, and its lead acts as liaison between the two groups, with each group having representation from the eight organisations. Learning from other regional collaborations, the group used the Future NHS site as a collaborative workspace and Teams as the main means of communication.
The setup of the repository was completed on time after three months. There was initially a steep learning curve for all, especially the Operational Group who undertook this process. The group identified key metadata and metadata standards for the repository, including the use of ORCIDs and the use of Wessex Classification as a controlled vocabulary. The setup process was facilitated by the collaborative nature of the project, as the variety of experience in the group was a great benefit. It should be noted that support from the suppliers related to technical matters only.
The collaborative nature of the project also allowed work to be shared, and tasks were given to members to be undertaken independently. However, a downside of collaborative projects is that decisions can take longer to be inclusive...
Poster presented at the Metadata & Discovery Group Conference & RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
Poster: Updating the Wessex Classification Scheme for UK health libraries : a... / CILIP MDG
The Wessex Classification Scheme was created by healthcare librarians in the South West of England, and was loosely based on the US National Library of Medicine classification. The scheme is widely used in healthcare libraries across the UK, both inside and outside the NHS. Although the scheme has gone through several revisions, there has been no major update since 2015, so the Wessex Classification Scheme Oversight Group was formed in September 2022 with the support of NHS England. The group aims to bring knowledge and skills from UK health library networks to improve the scheme and offers a chance for participants to develop skills in working with classification and subject indexing, and the opportunity to network widely. By forming a working group, it ensures the longevity of the scheme and shares the maintenance work more widely.
Initially, members were asked which parts of the scheme they felt needed updating the most and sub-groups were formed for LGBTQ+ issues and gender identity (the Pride sub-group), Ethnicity and Race, and Learning Disability and Neurodiversity (the LDN sub-group) as well as a smaller team working on ‘quick and simple’ updates....
Poster presented at the CILIP Metadata and Discovery Group (MDG) Conference & UKCoR RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham).
Revamping in-house cataloguing training / Victoria Parkinson (King's College ... / CILIP MDG
With hybrid working and a new LMS, we are revamping our in-house cataloguing training. We are learning from our teaching librarians and using the tools we have, such as Moodle, to create cataloguing training that allows anyone with an interest to learn the basics and makes the best use of face-to-face time for putting those skills into practice. Over the past eight years we’ve adapted and updated our in-house training, and I’ll also talk about how we decide what to teach colleagues, and how we try to make the best use of staff time to keep skills up when cataloguing is one of many competing priorities and shared across several teams. Between staff turnover, COVID lockdowns and service changes, we are starting almost from scratch in building a pool of staff who can catalogue the material our suppliers can’t provide records for, which is an excellent time to take stock of what our cataloguing needs are, and advocate for the importance of creating and upgrading good quality records and why we need to build these skills in-house.
Lightning Talk presented at the Metadata & Discovery Group Conference & RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
UK NACO funnel : progress, obstacles, and solutions / Martin Kelleher (Univer... / CILIP MDG
This Lightning Talk will provide a quick update on latest progress with the now established UK NACO Funnel, which allows participating institutions to contribute to Library of Congress / PCC authority control. The presentation will include a summary of the purpose of the funnel, details of latest expansion, problems and solutions with data submission software, and further plans and collaborations.
Lightning Talk presented at the Metadata & Discovery Group Conference & RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
Ship[w]right[e]s? : the challenges of cataloguing reports from scientific exp... / CILIP MDG
This document discusses the challenges of cataloguing reports from scientific expeditions, using the Challenger Reports as an example. It notes that there were 83 Challenger Reports published by 47 authors, held across 4 sites in 50 different sectional libraries, with 220 bib records created for this work. It also mentions the opportunity for the Natural History Museum to think about metadata across the entire museum collection as part of an effort to move specimens to a new location.
BFI Reuben Library : an RDA implementation story / Anastasia Kerameos (BFI Re... / CILIP MDG
“From 1st January 2024, Adlib will no longer be supported or maintained by Axiell.” This statement acted as the catalyst for action, enabling the release of resources to implement significant changes to the BFI Reuben Library’s record structure, which in turn prompted a deeper look into our current cataloguing practices and future requirements.
Upgrading to Axiell Collections will allow the library to implement new RDA more fully – we had previously adopted some aspects but not all – and, importantly, it will allow us to better align our data structure with that of the organisation’s other collections, making it easier to manage and compatible with further planned system developments. By the time of the conference in September we will be cataloguing to an under-the-bonnet Work–Expression–Manifestation–Item (WEMI) record hierarchy and new cataloguing guidelines.
Having watched all the webinars available, having read every piece of documentation which seemed relevant, having spent hours reading and re-reading the contents of the RDA Toolkit we are currently working on the last stages of our application profile whilst still debating issues around putting the theory into practice, especially in the area of aggregates and diachronic works. I do not suggest I have all the answers, far from it, but by sharing the story of our journey, that of a medium sized non-academic library of specialist mostly print collections and illustrating it with practical examples I hope my presentation will be of use to others currently travelling a similar path.
Paper presented at the Metadata & Discovery Group Conference & RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
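The WEMI record hierarchy described in the abstract above can be sketched with nested dataclasses. The field names are illustrative assumptions for explanation only, not the BFI's actual data model.

```python
# Minimal sketch of the RDA Work-Expression-Manifestation-Item (WEMI)
# hierarchy: each level holds a list of the level below it.
# Field names (title, language, publisher, shelfmark) are illustrative.
from dataclasses import dataclass, field

@dataclass
class Item:
    shelfmark: str            # one physical/digital copy

@dataclass
class Manifestation:
    publisher: str            # one published embodiment
    items: list = field(default_factory=list)

@dataclass
class Expression:
    language: str             # one realisation of the work
    manifestations: list = field(default_factory=list)

@dataclass
class Work:
    title: str                # the abstract creation
    expressions: list = field(default_factory=list)

# Build one path down the hierarchy.
work = Work("Sample Work")
expr = Expression("eng")
man = Manifestation("Sample Press", items=[Item("BFI-001")])
expr.manifestations.append(man)
work.expressions.append(expr)
print(work.title, work.expressions[0].manifestations[0].items[0].shelfmark)
```

The point of the nesting is that a catalogue can attach shared data (the title) once at Work level instead of repeating it on every copy.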
RDA implementation at the British Library / Thurstan Young (British Library) / CILIP MDG
On 23rd May 2023, the RDA Board announced that the original RDA Toolkit will be removed in May 2027. All RDA users will need to be prepared for transition to the official RDA Toolkit before then. As previously announced, a Countdown Clock will start running in May 2026, a year before the sunset date.
This paper will provide an update on the British Library’s plans for implementation of the new RDA Toolkit, following completion of the RDA Toolkit Restructure and Redesign (3R) project. It will provide an overview of the timeline and scope for implementation as well as describing the training and documentation underpinning the implementation and the support available to other institutions for their implementation.
Paper presented at the Metadata & Discovery Group Conference & RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
Community forward : developing descriptive cataloguing of rare materials (RDA... / CILIP MDG
Since 2013, Resource Description and Access (RDA) has been the chief cataloguing standard used in the United States. In 2019, the RDA Steering Committee previewed a new version of the RDA Toolkit, which introduced substantial changes, such as replacing instructions with a series of options, adding new concepts such as “nomens” and “diachronic works,” and replacing the prior organisation with a broader intellectual framework. This revised Toolkit became the official RDA Toolkit in December 2020, with major cataloguing bodies planning to adopt it in the coming years. Some cataloguers have expressed concerns regarding the official RDA Toolkit, particularly around cost and training required to learn the new standard.
In response to these concerns, the RBMS RDA Editorial Group, a group of volunteers from the Association of College and Research Libraries’ Rare Books and Manuscripts Section, developed a new manual, Descriptive Cataloging of Rare Materials (RDA Edition). DCRMR is informed by core principles of community and sustainability while employing open-access publication models and infrastructure. Designed in response to community feedback, it presents instructions in cataloguing workflow order using clear language while remaining aligned to the official RDA Toolkit and RDA element sets. The manual was approved in February 2022 in its first iteration and continues to be actively developed and updated. This presentation will discuss why the editorial group created an open and free manual; the process and tools for creating the manual, including the use of GitHub to publish a cataloguing standard; and outcomes to date.
Paper presented at the Metadata & Discovery Group Conference & RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
The West Midlands Evidence Repository (WMER) : a regional collaboration proje... / CILIP MDG
The West Midlands Evidence Repository (WMER) was born from a pre-pandemic recognition by managers of Knowledge and Library Services (KLSs) of 8 NHS Trusts in the West Midlands region of the need for a repository. This was to replace existing provision, or recognition of national priorities or local needs to record, collect, and share research, as well as potential for sharing patient information leaflets or guidelines. Some managers and services had previous experience of repositories, as well as being part of a national pilot. WMER, however, represented a new start for all to work in collaboration to establish a new service. The consortium would enable sharing of both costs and experience.
Initially, different repository suppliers were investigated by the KLS that had had a long-established repository, taking on board the experience of the group from the national pilot. The Atmire Open Repository platform was chosen as it met the consortium’s needs and had a proven track record of other collaborative repositories in the NHS. Financing was taken on by one Trust and the on-boarding was led in partnership between that Trust and the Trust that had undertaken the initial investigation.
With the initial on-boarding completed and the test server set up, the group took a step back to ensure they worked together as a collaborative going forward. Collaborative work between the KLSs was facilitated by the formal creation of two groups, a Managers Group for overall approval and financial decision making and an Operational Group handling the setup and administration of the repository for the consortium. The Operational Group is led by the service with most experience of managing repositories, and its lead acts as liaison between the two groups, with each group having representation from the eight organisations. Learning from other regional collaborations, the group used the Future NHS site as a collaborative workspace and Teams as the main means of communication.
The setup of the repository was completed on time after three months. There was initially a steep learning curve for all, especially the Operational Group who undertook this process. The group identified key metadata and metadata standards for the repository, including the use of ORCIDs and the use of Wessex Classification as a controlled vocabulary. The setup process was facilitated by the collaborative nature of the project, as the variety of experience in the group was a great benefit. It should be noted that support from the suppliers related to technical matters only.
The collaborative nature of the project also allowed work to be shared, and tasks were given to members to be undertaken independently. However, a downside of collaborative projects is that decisions can take longer to be inclusive...
Paper presented at the Metadata & Discovery Group Conference & RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
Authority of assertion in repository contributions to the PID graph / George ... / CILIP MDG
The principles surrounding Linked Open Data and their implementation within digital libraries are well understood. Such implementations may be challenging, but successes are now well documented and continue to demonstrate the benefits of disseminating and enriching existing metadata with improved semantics and relational associations. Often implemented as machine-readability enhancements to metadata that harness serializations of the Resource Description Framework (RDF) and its reliance on URIs, these LOD approaches have ensured that digital libraries, and similar GLAMR initiatives elsewhere, contribute to the growing knowledge graphs of the wider semantic web by declaring statements of fact about web entities. Within open scholarly ecosystems a growing use of persistent identifiers (PIDs) to define and link scholarly entities has emerged, e.g. DOIs, ORCIDs, etc. The requirement for greater URI persistence has been motivated by several developments within the scholarly space; suffice to state that, when combined with appropriate structured data, PIDs can support improvements to resource discovery, as well as facilitate contributions to the ‘PID graph’ – a scholarly data graph describing and declaring associative relations between scholarly entities.
While the increased adoption of PIDs has the potential to transform scholarship, ensuring that these PIDs are used appropriately, encoded correctly within metadata, and that all relevant relational associations between scholarly entities are declared presents challenges. This is especially true within open scholarly repositories, from where many contributions to the PID graph will be made but – unlike many LOD contexts – from where the authority to assert specific relations may not always exist. Such declarations need to demonstrate reliability and provenance and are central to the interlinking of heterogeneous textual objects, datasets, software, research instruments, equipment, and the related PIDs these items may generate, such as for people, organizations, or other abstract entities.
This paper will explore the issues that arise when levels of authority to assert are lacking or are uncertain, and review results from a related study exploring the ‘PID literacy’ of scholars...
Paper presented at the Metadata & Discovery Group Conference & RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
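The paper's central concern, that a PID-graph claim needs provenance so consumers can judge its authority, can be sketched minimally. All field names and values below are illustrative assumptions, not a real PID-graph schema; the ORCID iD is ORCID's published example identifier.

```python
# Sketch of a PID-graph assertion carried together with its provenance,
# so that downstream consumers can weigh the authority of the claim.
from dataclasses import dataclass

@dataclass(frozen=True)
class Assertion:
    subject_pid: str   # e.g. a DOI for a dataset or article
    relation: str      # e.g. "isAuthoredBy" (illustrative relation name)
    object_pid: str    # e.g. an ORCID iD
    asserted_by: str   # who made the claim: repository, author, publisher
    evidence: str      # the basis on which the claim was made

claim = Assertion(
    subject_pid="https://doi.org/10.1234/example",           # illustrative DOI
    relation="isAuthoredBy",
    object_pid="https://orcid.org/0000-0002-1825-0097",      # ORCID's example iD
    asserted_by="institutional-repository",
    evidence="self-deposit by authenticated ORCID holder",
)
print(claim.relation, claim.asserted_by)
```

Separating the claim from who asserted it is what lets a consumer treat a publisher-registered relation differently from an unverified self-deposit.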
Open Source Contributions to Postgres: The Basics, POSETTE 2024 / ElizabethGarrettChri...
Postgres is the most advanced open-source database in the world and it's supported by a community, not a single company. So how does this work? How does code actually get into Postgres? I recently had a patch submitted and committed and I want to share what I learned in that process. I’ll give you an overview of Postgres versions and how the underlying project codebase functions. I’ll also show you the process for submitting a patch and getting that tested and committed.
06-18-2024 Princeton Meetup - Introduction to Milvus / Timothy Spann
tim.spann@zilliz.com
https://www.linkedin.com/in/timothyspann/
https://x.com/paasdev
https://github.com/tspannhw
https://github.com/milvus-io/milvus
Get Milvused!
https://milvus.io/
Read my Newsletter every week!
https://github.com/tspannhw/FLiPStackWeekly/blob/main/142-17June2024.md
For more cool Unstructured Data, AI and Vector Database videos check out the Milvus vector database videos here
https://www.youtube.com/@MilvusVectorDatabase/videos
Unstructured Data Meetups -
https://www.meetup.com/unstructured-data-meetup-new-york/
https://lu.ma/calendar/manage/cal-VNT79trvj0jS8S7
https://www.meetup.com/pro/unstructureddata/
https://zilliz.com/community/unstructured-data-meetup
https://zilliz.com/event
Twitter/X: https://x.com/milvusio https://x.com/paasdev
LinkedIn: https://www.linkedin.com/company/zilliz/ https://www.linkedin.com/in/timothyspann/
GitHub: https://github.com/milvus-io/milvus https://github.com/tspannhw
Invitation to join Discord: https://discord.com/invite/FjCMmaJng6
Blogs: https://milvusio.medium.com/ https://www.opensourcevectordb.cloud/ https://medium.com/@tspann
Expand LLMs' knowledge by incorporating external data sources into your AI applications.
Build applications with generative AI on Google Cloud / Márton Kodok
We will explore Vertex AI Model Garden powered experiences and learn more about the integration of these generative AI APIs. We will see in action what the Gemini family of generative models offers developers for building and deploying AI-driven applications. Vertex AI includes a suite of foundation models, referred to as the PaLM and Gemini families of generative AI models, which come in different versions. We will cover how to use the API to:
- execute prompts in text and chat;
- cover multimodal use cases with image prompts;
- fine-tune and distill models to improve knowledge domains;
- run function calls with foundation models to optimize them for specific tasks.
At the end of the session, developers will understand how to innovate with generative AI and develop apps following generative AI industry trends.
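The prompt-execution step described above can be illustrated by assembling a request body of the shape the Gemini generateContent API expects. This is an offline sketch: no credentials or network call are involved, and the temperature value is an arbitrary illustrative choice.

```python
# Sketch of a minimal text-generation request body in the shape used by
# the Gemini generateContent API (contents -> role/parts -> text).
# Built locally only; sending it would require a Vertex AI endpoint
# and authentication, which are outside this example.
import json

def build_generation_request(prompt: str, temperature: float = 0.2) -> dict:
    """Assemble a minimal request body for a generative model call."""
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": {"temperature": temperature},
    }

request = build_generation_request("Summarise the FAIR data principles.")
print(json.dumps(request, indent=2))
```

Chat and multimodal use cases extend the same structure: further turns are appended to `contents`, and image parts sit alongside text parts.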
www.bl.uk

British Library Metadata Services
The British Library Act records our role as “national centre for… bibliographical & other information services”.
BL Metadata Services:
• Originally offered priced services and evolved through many technologies
• Began to offer open data in 2010 and Linked Open Data in 2011
• Collection Metadata Strategy published in 2015
Legacy Metadata Challenges
[Chart: % of records containing Language of Content, 2013 vs 2014, comparing Foundation catalogues, the Integrated Catalogue and Annual Production]
• Discovery
• Research
• Bibliometrics
• Management Information
• Visualisation
• Collection Management
Collection Metadata - Uses

Direct, internal - to record:
• Resource description & availability for discovery & access
• Collection inventory
• Preservation requirements
• Licensed content rights
• Legal deposit & purchased claims

Direct, external:
• Representation of BL holdings in shared catalogues
• Re-use in 3rd party commercial services
• Derived cataloguing by other libraries
• Identification of candidate material for collaborative digitisation

Indirect, internal:
• Preparing exhibitions
• Responding to FOI requests
• Website organisation
• Content identification for collaborative initiatives
• Management information

Indirect, external:
• Confirmation of UK publication status
• Identification of last resort copy for collection disposal initiatives
• Open data contribution
Communications
• Strategy published externally
• Established & communicated best practice via new CM Wiki
• Implemented:
  - Centralised support mailbox
  - Horizon scanning function
• Workshop on 2016/17 plans
Vision
“Our vision is that by 2020 the Library’s collection metadata assets will be comprehensive, coherent, authoritative and sustainable, enabling their full value to be unlocked for improved content management, greater collaboration and wider use of the collection.”
Objectives
• Drive efficiencies in the creation, management and exploitation of collection metadata to support delivery of the Library’s strategic priorities and programmes
• Improve the Library’s return on investment in its collection metadata assets by ensuring their long term value is maintained for future activities
• Open up more of the Library’s collection metadata to improve access to Library content and promote wider re-use
23. www.bl.uk 25
Collection Metadata Strategy
2015-18 Implementation Roadmap
[Roadmap diagram spanning 2015-2020, with workstreams: Metadata Management Standards; Preservation, Maintenance & Enhancement; Open Metadata; Discovery & Delivery; Licensing & Rights Management; Process Efficiencies; Communications; Technical Infrastructure. Milestones include:]
• Metadata assets register available
• ISNI integration into key metadata
• Collection metadata available via global cross-domain delivery channels
• Assessment and prioritisation of enhancement requirements
• Automated e-book metadata enhancement process implemented
• Establish horizon scanning function for metadata
• Undertake options appraisal for linked metadata platform replacement
• Aleph v22 implementation
• Metadata strategy workshop
• Development of new metadata visualisation tools
• Redevelopment of AMED service metadata
• Undertake investigation of DRM metadata solutions
• Convergence on agreed metadata standards portfolio
• Standardise solutions for derived collection metadata
• Internal communication of strategic priorities
• Completion of website redevelopment
• Large-scale batch enhancement solution implemented
• Harmonisation of existing metadata licensing practice
• IAMS data release
• Create metadata-based collection analysis tools
• Undertake review of national library metadata systems options
• Non-standard systems migration completed
• Assessment and prioritisation of metadata systems migration requirements
• All collection metadata held in centralised master metadata repositories
• SAMI metadata assessment
• Persistent identifier infrastructure for metadata and related content
• Implementation of comprehensive DRM metadata solution
• Promotion of agreed standards portfolio
• Preparatory work for ILS replacement
• Ongoing rationalisation of legacy systems & migration to supported master metadata repositories
• Improved user access options available
• Implementation of PSI directives for metadata sets
• Complex rights & license metadata solution available
• New staff intranet and wiki resources on collection metadata
• Optimised metadata management processes
• Efficient international standards engagement process
• Undertake review of new metadata creation & processing options
• Establish annual collection metadata audit processes
• Collaborate with international partners on core metadata standards
• Ongoing infrastructure development
• Migration to supported standards
• BNB linked data service improvements
• Printed music metadata release
• New open ‘researcher format’ metadata capability established
• Fully representative range of metadata sets available
• Hidden collection metadata assets exposed for re-use
• Centralised metadata license storage implemented
• Undertake audits of metadata assets & licenses
• Governance & internal support functions established
• Investigate deriving options for sound recordings
• Migration complete
• Collection metadata best practice resource available for staff
• New linked data platform implemented
• JISC national shared metadata service platform options appraisal
• External publication of strategy
• External user support functions centralised
• Preparatory work for Google digitisation phase 2
• Implementation of new external sources metadata
• Linked data exploitation options implemented
• Undertake review of metadata standards & develop engagement plan
• Investigate BL on Demand requirements
• Comprehensive open metadata service offering
• Unified, standardised metadata management infrastructure
Efficiencies
Reducing Cost & Complexity
We created new:
• E-publisher metadata
assessment & ingest processes
• Automated enhancement
processes for Western
European Languages & e-book
metadata
• Spreadsheet data capture &
crowd-source workflows
• Core metadata specification for
digitisation initiatives
• FAST Consultation
Asset Management
Maintaining Long Term Value
We implemented:
• Collection metadata audit &
assets register
• Centralised metadata licensing
store
• Metadata impact assessment
for business programmes
• Record of ‘hidden’ metadata
assets
Metadata distribution by repository type:
• 2014-15: 52.7% in strategic repositories, 47.3% in other repositories
• 2015-16: 59.9% in strategic repositories, 40.1% in other repositories
Open Metadata
Increasing Access, Reuse & Relevance
[Chart: open metadata set downloads, 2014/15 vs 2015/16]
177 countries use services
New open ‘researcher format’ option
Created open metadata sets
for printed music, manuscripts &
archives
Events – CILIP,
Cabinet Office Data Science…
Student research
collaborations
1550+ users
Open Metadata
Linked Open Data
Linked Data Analytics Project
• Who is using our data?
• Which data?
• How to optimise publication?
[Chart: linked data visitors by sector — government, academia, libraries]
745K ISNIs added to Linked Data BNB
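The analytics questions above start from being able to query the published data at all. As a minimal sketch, the helper below builds a SPARQL query against a BNB-style linked data service; `dct:title` follows the Dublin Core vocabulary used in the BNB data model, but the exact graph shape and the endpoint URL are assumptions to check against the current service documentation.

```python
def bnb_title_query(keyword, limit=10):
    """Build a SPARQL query for resources whose title contains `keyword`.

    The query string could then be POSTed to the service's SPARQL endpoint
    (endpoint URL and precise graph shape are assumptions, not the Library's
    documented API).
    """
    return (
        "PREFIX dct: <http://purl.org/dc/terms/>\n"
        "SELECT ?resource ?title WHERE {\n"
        "  ?resource dct:title ?title .\n"
        f'  FILTER(CONTAINS(LCASE(STR(?title)), "{keyword.lower()}"))\n'
        f"}} LIMIT {limit}"
    )
```

Keeping query construction separate from execution makes it easy to log which queries external users run, which is exactly the usage data the analytics project is after.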
Apologies if the colour is a bit blinding. Metadata is one of those things that people take for granted until it goes wrong, so we wanted something eye catching. And the future is bright for metadata.
The British Library is the UK’s national library and also a legal deposit library. The Library currently has around 1,500 staff operating on three sites in Stockton on Tees, Boston Spa and London. The main cataloguing department is based in Boston Spa, but there are many specialist cataloguers in London.
The Library was created in 1973 as the result of an Act of Parliament to amalgamate many existing institutions. Under the Act, the Library is “the national centre for bibliographical & other information services”.
From an external perspective, the Library has charged for many of its bibliographic services, but over recent years, in line with government policy, barriers to access have been removed. We began to offer open data in 2010 and Linked Open Data from 2011.
In 2015 we published our first Collection Metadata Strategy. I’ll come to what the strategy actually is a bit later, but first, why did we decide that we needed a strategy?
“Why” was the hardest question to answer. In fact, it needed a separate document to justify the need for a strategy.
The Library faces new challenges. In 2013 the scope of legal deposit was extended to non-print material. This means that publishers can be switched from deposit of printed books and serials to deposit of the e-book or e-journal. The rate of transition is under the Library’s control, which enables us to develop our infrastructure. However, we are already seeing interesting challenges. Based on the contribution from a very small subset of publishers, between May 2015 and January 2016 we received 50,000 e-books. This corresponds to about 50% of the annual intake of printed books. Closer analysis of what has been deposited shows that we are receiving materials which are in effect re-issues of printed books and, because of the international scope of publishing, materials which have no UK imprint but are distributed in the UK. For much of this additional material we may already have records for the print manifestation, which we should be able to reuse and adapt.
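Reusing an existing print record for an e-book deposit boils down to matching on a shared identifier and then overriding the carrier-specific fields. The sketch below illustrates the idea with ISBN matching; the field names (`related_isbns`, `carrier`) are hypothetical, not the Library’s actual record schema.

```python
def normalise_isbn(isbn):
    """ISBNs arrive with inconsistent hyphenation; compare a canonical form."""
    return isbn.replace("-", "").replace(" ", "")

def find_print_record(ebook, print_catalogue):
    """Return an existing print record sharing an ISBN with the deposit, if any."""
    wanted = {normalise_isbn(i) for i in ebook.get("related_isbns", [])}
    for record in print_catalogue:
        if normalise_isbn(record["isbn"]) in wanted:
            return record
    return None

def derive_ebook_record(print_record, ebook):
    """Reuse the descriptive fields from print; override carrier-specific ones."""
    derived = dict(print_record)
    derived["isbn"] = ebook["isbn"]
    derived["carrier"] = "electronic"
    return derived
```

In practice matching would need fallbacks (title/author keys, fuzzy matching) for deposits that carry no usable related-identifier data.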
There are also long standing challenges. We still have printed catalogues. Around 2 million items can only be found by consulting a printed catalogue in the reading room. There is a smaller but unquantified number of catalogues and finding aids on electronic media that need to be migrated to strategic repositories. There are some collections that remain uncatalogued.
A pervasive problem is that a lot of legacy metadata, particularly retrospectively converted metadata falls far short of current standards.
Here is an example of legacy metadata: this record was retrospectively converted from the British Museum catalogue. As the cataloguers among you will recognise, there are some omissions. Where is the language of content? Where is the country of publication? Where is the subject data? This information wasn’t recorded in the original catalogue. Taking language as an example, the orange line shows that for current processing 100% of published resources are coded for language of content. However, if we look at the foundation or legacy catalogues, this drops to under 30%, which has implications for services and for the metadata to inform research and collection management or to visualise the collection. How can you build a linguistic picture of the collection if the language is not explicitly recorded? This metadata is not capable of answering the questions posed by users and staff.
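A coverage figure like “under 30% of foundation-catalogue records carry a language code” comes from a simple completeness audit over the record set. As a minimal sketch (the `catalogue` and `language` field names are illustrative, not the Library’s schema):

```python
from collections import defaultdict

def coverage_by_catalogue(records, field="language"):
    """Per-catalogue percentage of records carrying an explicit code for `field`."""
    totals, coded = defaultdict(int), defaultdict(int)
    for rec in records:
        totals[rec["catalogue"]] += 1
        if rec.get(field):
            coded[rec["catalogue"]] += 1
    return {cat: 100.0 * coded[cat] / totals[cat] for cat in totals}
```

Running the same audit per field (language, country of publication, subject) gives the evidence base for prioritising which legacy enhancements pay off first.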
Metadata is increasing in importance but it was recognised that within the organization responsibility for metadata was divided and the boundaries were unclear. Addressing the challenges would be impossible unless we could also break down silos.
This is a functional overview of our metadata architecture. Metadata for various resource types is sourced externally or created internally and comes in a variety of schemas. The Library manages the metadata, it supports discovery, and we disseminate it through various channels. It looks relatively straightforward, but if we overlay the systems we reveal a lot of complexity.
In effect each content stream has its own input format and its own system. Each content stream is therefore a distinct silo, usually with its own workflow and specialist staff.
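One common way to break down per-stream input formats is a schema crosswalk: each source schema gets a mapping into one shared core model, so downstream processes only ever see one shape. A minimal sketch (the schema names and field mappings are illustrative, not the Library’s actual configuration):

```python
# Per-schema field mappings into a shared core model. Real crosswalks would
# also handle repeated fields, subfield parsing and vocabulary normalisation.
FIELD_MAPS = {
    "marc": {"245a": "title", "100a": "creator", "008_lang": "language"},
    "dc":   {"title": "title", "creator": "creator", "language": "language"},
}

def to_core(record, schema):
    """Normalise a source record to the shared core model."""
    mapping = FIELD_MAPS[schema]
    return {core: record[src] for src, core in mapping.items() if src in record}
```

The point of the design is that adding a new content stream means adding one mapping, not one more parallel workflow.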
The persistence of silos at the discovery layer prevents cross searching of the whole collection, but the complexity they create at every level is a barrier to efficiency.
In 2004 we implemented Aleph, which removed significant long standing silos based around printed collections. This has created much more uniform metadata and more efficient processes, but many content streams and processes were out of scope and thus silos have been perpetuated.
To some extent these silos reflect historic divisions between different components of the Library and responsibility has been distributed between different services and departments.
The following diagram gives an overview of the range of external stakeholders
And internal stakeholders who are increasingly dependent on metadata to deliver services, and the Library’s strategic objectives.
The fundamental idea behind the evolving strategy was to treat collection metadata as a strategic asset of the Library, on a par with the collection, the staff and the estate. We deliberately limited the scope to metadata about the collection. Collection metadata identifies the attributes and relationships of collection resources; the location and availability of collection resources; and the status and rights associated with them.
Like the collection or any other asset, collection metadata needs careful stewardship over time, and this in turn needs clear leadership and adequate resourcing.
Like any other asset, investment should be rewarded by improvements in efficiency, better services and more value for money. An important contention is that the metadata should be made to work much harder.
The vision provides a view of where we want to be in 2020, which aligns closely with the Library’s Strategy, Living Knowledge and its core programmes.
There are three core objectives:
Drive efficiencies in the creation, management and exploitation of collection metadata to support delivery of the Library’s strategic priorities and programmes
Improve the Library’s return on investment in its collection metadata assets by ensuring their long term value is maintained for future activities
Open up more of the Library’s collection metadata to improve access to Library content and promote wider re-use
And a plan…
So where have we got to?
We have raised the awareness of collection metadata within the institution. We are also better informed of new requirements and receive earlier notice which allows us to influence as well as respond.
It is clear that metadata is central to the delivery of many corporate strategic objectives. This has increased demand, but also substantiates bids for additional resource.
Convergence creates new synergies. The vision to bring our metadata together informs decisions about system architecture.