MarcEdit task lists and vendor-supplied metadata : revisiting the subscriber-publisher relationship at the University of Leeds / Ceilan Hunter-Green (University of Leeds).
Like many institutions, the University of Leeds began purchasing and subscribing to streaming video services during the Covid lockdowns, including the new offering of a British Film Institute institutional subscription. This BFI subscription offered an excellent selection, but there were barriers to discoverability and analytics: no vendor-supplied records were available, and manual creation of individual records carried a high staffing cost. Bare-bones scratch records could be created quickly with the limited metadata provided by the vendor every month, but then had to be manually supplemented with metadata copied from the streaming platform BFI Player.
After about 18 months of this labour-intensive arrangement, a chance conversation with a colleague prompted a re-examination of our vendor-subscriber partnership. Instead of creating records just for our institution, why not share them with other subscribers and get a discount for our subscription?
A MarcEdit task list was developed that would enable creation of full RDA-compliant MARC records on the condition that BFI supplied a .csv file of comprehensive metadata from BFI Player. It was a fantastic learning opportunity to develop skills in MarcEdit and to update the team’s knowledge of video streaming cataloguing. Much of this learning was done through the Library Juice Academy’s video streaming and MarcEdit courses, as well as consulting the NISO video and audio metadata guidelines to ensure that the records we would provide to our community would be the most comprehensive possible. Just as we’d hoped, this new arrangement allowed the negotiation of a subscription discount for the University in exchange for sharing these monthly addition and deletion records with other subscribing institutions at no extra cost.
While the task list creation process was a technical challenge, the community impact of the new arrangement has great potential to benefit our fellow subscribing institutions. Subscribers now receive records for individual films rather than relying on a single platform record, which will allow for greater analysis of collection usage, direct reading list linking for fellow academic institutions, and improved accessibility faceting through the discovery layer with the newly generated 341 and 655 fields. This presentation will serve both as a practical demonstration of MarcEdit task lists and regular expressions to normalise and enhance vendor metadata (including populating the 008 field with production date, runtime and language information; creating conditional 655 fields for short/feature film and fiction/nonfiction film; and adding enhanced accessibility fields for closed captioning and audio description in the 341, 532 and 655 fields) and as an exploration of the potential for institutions with greater staffing power to facilitate community access to vendor content.
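The 008 and conditional 655 work described above can be sketched in Python. This is an illustrative sketch, not the actual Leeds task list: the character positions follow the MARC 21 008/Visual Materials definition, while the place-of-publication code, the 40-minute short-film threshold, and the output in MarcEdit mnemonic format are assumptions made for the example.

```python
from datetime import date

def build_008(date1, runtime_min, lang, entered=None):
    """Assemble a 40-character MARC 008 fixed field for an online
    videorecording, per the Visual Materials layout.
    date1: four-digit production year; runtime_min: minutes; lang: MARC code."""
    entered = entered or date.today().strftime("%y%m%d")
    runtime = f"{min(runtime_min, 999):03d}"   # 18-20: running time, zero-filled
    f008 = (
        entered         # 00-05 date entered on file
        + "s"           # 06    single known date
        + f"{date1:<4}" # 07-10 date 1 (production date)
        + "    "        # 11-14 date 2: blank
        + "xxk"         # 15-17 place: United Kingdom (assumption)
        + runtime       # 18-20 running time in minutes
        + " "           # 21    undefined
        + " "           # 22    target audience: unknown
        + "     "       # 23-27 undefined
        + " "           # 28    not a government publication
        + "o"           # 29    form of item: online
        + "   "         # 30-32 undefined
        + "v"           # 33    type of visual material: videorecording
        + "l"           # 34    technique: live action
        + lang          # 35-37 language code
        + " "           # 38    modified record: not modified
        + "d"           # 39    cataloguing source: other
    )
    assert len(f008) == 40
    return f008

def genre_655s(runtime_min, is_fiction):
    """Conditional LCGFT 655 fields in MarcEdit mnemonic format.
    The 40-minute short/feature cutoff is an illustrative assumption."""
    terms = ["Short films." if runtime_min <= 40 else "Feature films."]
    terms.append("Fiction films." if is_fiction else "Nonfiction films.")
    return [f"=655  \\7$a{t}$2lcgft" for t in terms]
```

In practice the same normalisation is expressed as find/replace tasks with regular expressions inside a MarcEdit task list; the sketch simply makes the conditional logic explicit.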
Paper presented at the Metadata & Discovery Group Conference & RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
RD shared services and research data spring / Jisc RDM
Daniela Duca's presentation at the DataVault workshop on 29 June. An overview of research at risk, research data shared service and research data spring.
UK Research Infrastructure Workshop: E-infrastructure / Juan Bicarregui / Innovate UK
UK Research Infrastructure Workshop: E-infrastructure, Juan Bicarregui
How to build a successful EU project
by Juan Bicarregui
Scientific Computing Department STFC
Amy Devenney (JISC), Beth Harris (JISC)
This session will detail the process implemented with 13 publishers to collect article-level metadata on open access publications for Jisc transitional agreements throughout 2020 and discuss the challenges encountered. It will also demonstrate how the data collected has allowed Jisc to effectively monitor and evaluate transitional agreements and conclude by outlining recommendations to improve the transparency of the transition to open access.
Laura Wong - Jisc
In the UK, the increase in transitional agreements (TAs) has prompted us to ask new questions about how we measure the impact of the transition to OA, the performance of agreements, and the metrics we need. With COUNTER usage reports, we expect to see a shift in interest to global usage for individual research outputs. In this presentation, we cover:
• Drivers, opportunities and challenges in open access usage reporting for libraries and consortia such as Jisc
• Roles of publisher and institutional repository usage statistics
• How COUNTER 5.1 supports this work
• Next steps, lessons learned and the practical takeaways
Intro to buildingsmart and COBie - Nick Tune at Ecobuild 2015 / The NBS
Nick Tune joined us at Ecobuild 2015, and kicked off our selection of BIM seminars with Introduction to Buildingsmart and COBie - you can now see the slides here!
The following presentations were delivered at the Local Waste Services Standards Beta Showcase on 22 February 2016:
Linda O'Halloran, Product Owner, gave an overview of the project objectives and roadmap and resources delivered so far.
Sarah Prag, Service Design Project Lead, gave a business case update: an overview of the approach, headline numbers, the bigger picture and how to feed in.
Paul Mackay, Technical Lead, talked about open standards, taxonomies and APIs.
Helen Dobson, Kira Brayman
Jisc
This breakout will provide an opportunity for attendees to delve deeper into the findings of the Critical Review of Transitional Agreements discussed by Chris Banks and Caren Milloy in the second plenary session. We will discuss the methodology in more detail, as well as elaborate on our findings on the prevalence of Open Access and the extent to which UK transitional agreements have met the sector’s requirements. We will also ask several questions of the audience to help us gauge the UKSG community’s reactions to the findings and ambitions for the future of open research dissemination.
Developing a persistent identifier roadmap for open access to UK research / Jisc
The Jisc/UKRI PID roadmap project and report (developing a persistent identifier roadmap for open access to UK research).
Hilda Muchando, senior information policy officer, Jisc.
A presentation at Jisc's persistent identifiers and open access in the UK: the way forward online event on 25 June 2020.
Easy SPARQLing for the Building Performance Professional / Martin Kaltenböck
Slides of Martin Kaltenböck's (SWC) presentation at the SEMANTiCS2014 conference in Leipzig on 5 September 2014 about the 'Tool for Building Energy Performance Scenarios' of the GBPN (Global Buildings Performance Network, http://gbpn.org), which provides a prediction tool for building performance worldwide by making use of Linked Open Data (LOD).
OpenAIRE services and tools - presentation at #DI4R2016 / OpenAIRE
Presentation at the Digital Infrastructures for Research Conference 2016 (30 Sept.). Title: Open Access and Open Data in Horizon 2020: for Research Managers and Project Coordinators, by Pedro Príncipe (University of Minho)
UK Committee on RDA, RDA Day: New Tools for the Future of Cataloguing - Jenny... / CILIP MDG
“The RDA Day is programmed by the UK Committee on RDA. Using activities and games throughout informative presentations, the RDA Day will inform and engage metadata practitioners and managers on a content standard which integrates well with the metadata needs of the 21st century”
Paper presented on the UKCoR RDA Day during the Metadata & Discovery Group Conference (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
Challenges to implementation - Jenny Wright / CILIP MDG
“The RDA Day is programmed by the UK Committee on RDA. Using activities and games throughout informative presentations, the RDA Day will inform and engage metadata practitioners and managers on a content standard which integrates well with the metadata needs of the 21st century”
Paper presented on the UKCoR RDA Day during the Metadata & Discovery Group Conference (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
More Related Content
Similar to MarcEdit task lists and vendor-supplied metadata : revisiting the subscriber-publisher relationship at the University of Leeds / Ceilan Hunter-Green (University of Leeds).
Application Profiles in RDA - Jenny Wright / CILIP MDG
“The RDA Day is programmed by the UK Committee on RDA. Using activities and games throughout informative presentations, the RDA Day will inform and engage metadata practitioners and managers on a content standard which integrates well with the metadata needs of the 21st century”
Paper presented on the UKCoR RDA Day during the Metadata & Discovery Group Conference (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
The Official RDA Toolkit - Opportunities for Efficiency - Thurstan Young / CILIP MDG
“The RDA Day is programmed by the UK Committee on RDA. Using activities and games throughout informative presentations, the RDA Day will inform and engage metadata practitioners and managers on a content standard which integrates well with the metadata needs of the 21st century”
Paper presented on the UKCoR RDA Day during the Metadata & Discovery Group Conference (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
The Official RDA Toolkit - Opportunities for Enrichment - Thurstan Young / CILIP MDG
“The RDA Day is programmed by the UK Committee on RDA. Using activities and games throughout informative presentations, the RDA Day will inform and engage metadata practitioners and managers on a content standard which integrates well with the metadata needs of the 21st century”
Paper presented on the UKCoR RDA Day during the Metadata & Discovery Group Conference (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
RDA methods, scenarios, tools - Gordon Dunsire / CILIP MDG
“The RDA Day is programmed by the UK Committee on RDA. Using activities and games throughout informative presentations, the RDA Day will inform and engage metadata practitioners and managers on a content standard which integrates well with the metadata needs of the 21st century”
Paper presented on the UKCoR RDA Day during the Metadata & Discovery Group Conference (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
Poster: What’s in a name? Re-Discovering cataloguing and index through metada... / CILIP MDG
In 2019 CILIP’s Cataloguing and Indexing Group changed its name to the Metadata and Discovery Group. This poster will showcase the transition of the look and feel of the group’s logo and the process of designing a new one.
Poster presented at the CILIP Metadata and Discovery Group (MDG) Conference & UKCoR RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham).
Poster: Revamping our in-house cataloguing training / Victoria Parkinson (Kin... / CILIP MDG
With hybrid working and a new LMS, we are revamping our in-house cataloguing training. We are learning from our teaching librarians and using the tools we have, such as Moodle, to create cataloguing training that allows anyone with an interest to learn the basics, making the best use of face-to-face time for putting those skills into practice. Over the past eight years we’ve adapted and updated our in-house training, and I’ll also talk about how we decide what to teach colleagues, and how we try to make the best use of staff time to keep skills up when cataloguing is one of many competing priorities and shared across several teams. Between staff turnover, COVID lockdowns and service changes, we are starting almost from scratch in building a pool of staff who can catalogue the material our suppliers can’t provide records for. This is an excellent time to take stock of what our cataloguing needs are, and to advocate for the importance of creating and upgrading good-quality records and why we need to build these skills in-house.
Poster presented at the Metadata & Discovery Group Conference & RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
Poster: FAST : can it lighten the load, and what is the impact? / Jenny Wrigh... / CILIP MDG
This poster presents the Faceted Application of Subject Terminology, giving an overview of the scheme, its advantages and potential issues, and its practical implementation. It will demonstrate that FAST is an important development for those interested in Linked Data, and the ways in which it is a useful tool for discovery in any system.
Poster presented at the CILIP Metadata and Discovery Group (MDG) Conference & UKCoR RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham).
Poster: The West Midlands Evidence Repository (WMER) : a regional collaborati... / CILIP MDG
The West Midlands Evidence Repository (WMER) was born from a pre-pandemic recognition by managers of Knowledge and Library Services (KLSs) at 8 NHS Trusts in the West Midlands region of the need for a repository. This was to replace existing provision, and responded to national priorities and local needs to record, collect and share research, as well as the potential for sharing patient information leaflets or guidelines. Some managers and services had previous experience of repositories, including as part of a national pilot. WMER, however, represented a new start for all to work in collaboration to establish a new service. The consortium would enable sharing of both costs and experience.
Initially, different repository suppliers were investigated by the KLS that had had a long-established repository, taking on board the experience of the group from the national pilot. The Atmire Open Repository platform was chosen as it met the consortium’s needs and had a proven track record of other collaborative repositories in the NHS. Financing was taken on by one Trust and the on-boarding was led in partnership between that Trust and the Trust that had undertaken the initial investigation.
With the initial on-boarding completed and the test server set-up, the group took a step back to ensure they worked together as a collaborative going forward. Collaborative work between the KLSs was facilitated by the formal creation of two groups, a Managers Group for overall approval and financial decision making and an Operational Group handling the setup and administration of the repository for the consortium. The Operational Group is led by the service with most experience of managing repositories and the lead of it acts as liaison between the two groups, with each group having representation from the eight organisations. Learning from other regional collaborations the Future NHS site was used as a collaborative workspace and Teams as the main means of communication.
The setup of the repository was completed on time after three months. There was initially a steep learning curve for all, especially the Operational Group who undertook this process. The group identified key metadata and metadata standards for the repository, including the use of ORCIDs and the use of the Wessex Classification as a controlled vocabulary. The setup process was facilitated by the collaborative nature of the project, as the variety of experience in the group was a great benefit. It should be noted that support from the suppliers related to technical matters only.
The collaborative nature of the project also allowed work to be shared, and tasks were given to members to be undertaken independently. However, a downside of collaborative projects is that decisions can take longer to be inclusive...
Poster presented at the Metadata & Discovery Group Conference & RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
Poster: Updating the Wessex Classification Scheme for UK health libraries : a... / CILIP MDG
The Wessex Classification Scheme was created by healthcare librarians in the South West of England, and was loosely based on the US National Library of Medicine classification. The scheme is widely used in healthcare libraries across the UK, both inside and outside the NHS. Although the scheme has gone through several revisions, there has been no major update since 2015, so the Wessex Classification Scheme Oversight Group was formed in September 2022 with the support of NHS England. The group aims to bring knowledge and skills from UK health library networks to improve the scheme and offers a chance for participants to develop skills in working with classification and subject indexing, and the opportunity to network widely. By forming a working group, it ensures the longevity of the scheme and shares the maintenance work more widely.
Initially, members were asked which parts of the scheme they felt needed updating the most and sub-groups were formed for LGBTQ+ issues and gender identity (the Pride sub-group), Ethnicity and Race, and Learning Disability and Neurodiversity (the LDN sub-group) as well as a smaller team working on ‘quick and simple’ updates....
Poster presented at the CILIP Metadata and Discovery Group (MDG) Conference & UKCoR RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham).
Revamping in-house cataloguing training / Victoria Parkinson (King's College ... / CILIP MDG
With hybrid working and a new LMS, we are revamping our in-house cataloguing training. We are learning from our teaching librarians and using the tools we have, such as Moodle, to create cataloguing training that allows anyone with an interest to learn the basics, making the best use of face-to-face time for putting those skills into practice. Over the past eight years we’ve adapted and updated our in-house training, and I’ll also talk about how we decide what to teach colleagues, and how we try to make the best use of staff time to keep skills up when cataloguing is one of many competing priorities and shared across several teams. Between staff turnover, COVID lockdowns and service changes, we are starting almost from scratch in building a pool of staff who can catalogue the material our suppliers can’t provide records for. This is an excellent time to take stock of what our cataloguing needs are, and to advocate for the importance of creating and upgrading good-quality records and why we need to build these skills in-house.
Lightning Talk presented at the Metadata & Discovery Group Conference & RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
UK NACO funnel : progress, obstacles, and solutions / Martin Kelleher (Univer... / CILIP MDG
This Lightning Talk will provide a quick update on latest progress with the now established UK NACO Funnel, which allows participating institutions to contribute to Library of Congress / PCC authority control. The presentation will include a summary of the purpose of the funnel, details of latest expansion, problems and solutions with data submission software, and further plans and collaborations.
Lightning Talk presented at the Metadata & Discovery Group Conference & RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
Ship[w]right[e]s? : the challenges of cataloguing reports from scientific exp... / CILIP MDG
Reports from scientific expeditions represent an important class of bibliographic object held by libraries of natural history institutions. They are, as is increasingly understood, important both as scientific records providing crucial context for specimen collections and as historical documents of empire and colonisation. At the Natural History Museum, London (NHM) we hold reports and other documentation relating to many of the most significant expeditions from the eighteenth to the twentieth century. In this short paper I would like to draw out some of the issues faced when cataloguing these works from three angles: descriptive cataloguing, subject cataloguing, and authority control. I will consider questions of dependent and independent titles, ships as corporate bodies and other entity relationships, form/genre headings, geographic headings and LCSH.
Lightning Talk presented at the Metadata & Discovery Group Conference & RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
BFI Reuben Library : an RDA implementation story / Anastasia Kerameos (BFI Re... / CILIP MDG
“From 1st January 2024, Adlib will no longer be supported or maintained by Axiell.” This statement acted as the catalyst for action, enabling the release of resources to implement significant changes to the BFI Reuben Library’s record structure, which in turn prompted a deeper look into our current cataloguing practices and future requirements.
Upgrading to Axiell Collections will allow the library to implement new RDA more fully (we had previously adopted some aspects but not all) and, importantly, it will allow us to better align our data structure with that of the organisation’s other collections, making it easier to manage and compatible with further planned system developments. By the time of the conference in September we will be cataloguing to an under-the-bonnet Work – Expression – Manifestation – Item (WEMI) record hierarchy and new cataloguing guidelines.
Having watched all the webinars available, read every piece of documentation that seemed relevant, and spent hours reading and re-reading the contents of the RDA Toolkit, we are currently working on the last stages of our application profile while still debating issues around putting the theory into practice, especially in the area of aggregates and diachronic works. I do not suggest I have all the answers, far from it, but by sharing the story of our journey, that of a medium-sized non-academic library of specialist, mostly print collections, and illustrating it with practical examples, I hope my presentation will be of use to others currently travelling a similar path.
Paper presented at the Metadata & Discovery Group Conference & RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
RDA implementation at the British Library / Thurstan Young (British Library) / CILIP MDG
On 23rd May 2023, the RDA Board announced that the original RDA Toolkit will be removed in May 2027. All RDA users will need to be prepared for transition to the official RDA Toolkit before then. As previously announced, a Countdown Clock will start running in May 2026, a year before the sunset date.
This paper will provide an update on the British Library’s plans for implementation of the new RDA Toolkit, following completion of the RDA Toolkit Restructure and Redesign (3R) project. It will provide an overview of the timeline and scope for implementation as well as describing the training and documentation underpinning the implementation and the support available to other institutions for their implementation.
Paper presented at the Metadata & Discovery Group Conference & RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
Community forward : developing descriptive cataloguing of rare materials (RDA... / CILIP MDG
Since 2013, Resource Description and Access (RDA) has been the chief cataloguing standard used in the United States. In 2019, the RDA Steering Committee previewed a new version of the RDA Toolkit, which introduced substantial changes, such as replacing instructions with a series of options, adding new concepts such as “nomens” and “diachronic works,” and replacing the prior organisation with a broader intellectual framework. This revised Toolkit became the official RDA Toolkit in December 2020, with major cataloguing bodies planning to adopt it in the coming years. Some cataloguers have expressed concerns regarding the official RDA Toolkit, particularly around cost and training required to learn the new standard.
In response to these concerns, the RBMS RDA Editorial Group, a group of volunteers from the Association of College and Research Libraries’ Rare Books and Manuscripts Section, developed a new manual, Descriptive Cataloging of Rare Materials (RDA Edition). DCRMR is informed by core principles of community and sustainability while employing open-access publication models and infrastructure. Designed in response to community feedback, it presents instructions in cataloguing workflow order using clear language while remaining aligned to the official RDA Toolkit and RDA element sets. The manual was approved in February 2022 in its first iteration and continues to be actively developed and updated. This presentation will discuss why the editorial group created an open and free manual; the process and tools for creating the manual, including the use of GitHub to publish a cataloguing standard; and outcomes to date.
Paper presented at the Metadata & Discovery Group Conference & RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
The West Midlands Evidence Repository (WMER) : a regional collaboration proje... / CILIP MDG
The West Midlands Evidence Repository (WMER) was born from a pre-pandemic recognition by managers of Knowledge and Library Services (KLSs) at 8 NHS Trusts in the West Midlands region of the need for a repository. This was to replace existing provision, and responded to national priorities and local needs to record, collect and share research, as well as the potential for sharing patient information leaflets or guidelines. Some managers and services had previous experience of repositories, including as part of a national pilot. WMER, however, represented a new start for all to work in collaboration to establish a new service. The consortium would enable sharing of both costs and experience.
Initially, different repository suppliers were investigated by the KLS that had had a long-established repository, taking on board the experience of the group from the national pilot. The Atmire Open Repository platform was chosen as it met the consortium’s needs and had a proven track record of other collaborative repositories in the NHS. Financing was taken on by one Trust and the on-boarding was led in partnership between that Trust and the Trust that had undertaken the initial investigation.
With the initial on-boarding completed and the test server set-up, the group took a step back to ensure they worked together as a collaborative going forward. Collaborative work between the KLSs was facilitated by the formal creation of two groups, a Managers Group for overall approval and financial decision making and an Operational Group handling the setup and administration of the repository for the consortium. The Operational Group is led by the service with most experience of managing repositories and the lead of it acts as liaison between the two groups, with each group having representation from the eight organisations. Learning from other regional collaborations the Future NHS site was used as a collaborative workspace and Teams as the main means of communication.
The setup of the repository was completed on time after three months. There was initially a steep learning curve for all, especially the Operational Group who undertook this process. The group identified key metadata and metadata standards for the repository, including the use of ORCIDs and the use of Wessex Classification as a controlled vocabulary. The setup process was facilitated by the collaborative nature of the project as the variety of experience in the group was a great benefit. It should be noted support from the suppliers was specifically related to technical support only.
The collaborative nature of the project also allowed work to be shared, and tasks were given to members to be undertaken independently. However, a downside of collaborative projects is that decisions can take longer to be inclusive...
Paper presented at the Metadata & Discovery Group Conference & RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
Authority of assertion in repository contributions to the PID graph / George ...CILIP MDG
The principles surrounding Linked Open Data and their implementation within digital libraries are well understood. Such implementations may be challenging, but successes are now well documented and continue to demonstrate the benefits of disseminating and enriching existing metadata with improved semantics and relational associations. Often facilitating machine-readability enhancements to metadata by harnessing serializations of the Resource Description Framework (RDF) and its reliance on URIs, these LOD approaches have ensured digital libraries, and similar GLAMR initiatives elsewhere, contribute to the growing knowledge graphs associated with the wider semantic web by declaring statements of fact about web entities. Within open scholarly ecosystems a growing use of persistent identifiers (PIDs) to define and link scholarly entities has emerged, e.g., DOIs, ORCIDs, etc. The requirement for greater URI persistence has been motivated by several developments within the scholarly space; suffice to state that, when combined with appropriate structured data, PIDs can support improvements to resource discovery, as well as facilitate contributions to the ‘PID graph’ – a scholarly data graph describing and declaring associative relations between scholarly entities.
While the increased adoption of PIDs has the potential to transform scholarship, ensuring that these PIDs are used appropriately, encoded correctly within metadata, and that all relevant relational associations between scholarly entities are declared presents challenges. This is especially true within open scholarly repositories, from where many contributions to the PID graph will be made but – unlike many LOD contexts – from where the authority to assert specific relations may not always exist. Such declarations need to demonstrate reliability and provenance and are central to the interlinking of heterogeneous textual objects, datasets, software, research instruments, equipment, and the related PIDs these items may generate, such as for people, organizations, or other abstract entities.
This paper will explore the issues that arise when levels of authority to assert are lacking or are uncertain, and review results from a related study exploring the ‘PID literacy’ of scholars...
Paper presented at the Metadata & Discovery Group Conference & RDA Day (6th - 8th Sept 2023 at IET Austin Court, Birmingham)
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
UiPath Test Automation using UiPath Test Suite series, part 5DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with DevOps.
Topics covered:
CI/CD within UiPath
End-to-end overview of CI/CD pipeline with Azure DevOps
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
Full-RAG: A modern architecture for hyper-personalizationZilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
20 Comprehensive Checklist of Designing and Developing a WebsitePixlogix Infotech
Dive into the world of Website Designing and Developing with Pixlogix! Looking to create a stunning online presence? Look no further! Our comprehensive checklist covers everything you need to know to craft a website that stands out. From user-friendly design to seamless functionality, we've got you covered. Don't miss out on this invaluable resource! Check out our checklist now at Pixlogix and start your journey towards a captivating online presence today.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
Climate impact / sustainability of software testing discussed in the talk. ICT and testing must carry their part of global responsibility to help with climate warming. We can minimize the carbon footprint, but we can also have a carbon handprint, a positive impact on the climate. Sustainability can be added as a quality characteristic and then measured continuously. Test environments can be used less, at smaller scale, and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
GridMate - End to end testing is a critical piece to ensure quality and avoid...ThomasParaiso2
End to end testing is a critical piece to ensure quality and avoid regressions. In this session, we share our journey building an E2E testing pipeline for GridMate components (LWC and Aura) using Cypress, JSForce, FakerJS…
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
MarcEdit task lists and vendor-supplied metadata : revisiting the subscriber-publisher relationship at the University of Leeds / Ceilan Hunter-Green (University of Leeds).
1. MARCEDIT TASK LISTS AND VENDOR-SUPPLIED METADATA: REVISITING THE SUBSCRIBER-PUBLISHER RELATIONSHIP AT THE UNIVERSITY OF LEEDS
Ceilan Hunter-Green
Metadata and Discovery Coordinator
University of Leeds
CILIP Metadata & Discovery Group Conference 2023
#CILIPMDG2023
2. TIMELINE
March 2020: First COVID lockdown
Jan 2021: Purchase individual Kanopy titles
Sept 2021: BFI institutional subscription
Aug 2022: Begin negotiations to provide records; start developing task list
June 2023: First batch of all records sent to subscribing institutions
3. BACKGROUND: WHY THE BRITISH FILM INSTITUTE?
• Strength of British Film Institute collection, particularly foreign-language and early film history material
• New offering of institutional subscription
• Challenges:
o Around 600 active titles, comparatively small offering vs. Box of Broadcasts (over 30,000 programmes) and Kanopy (currently around 21,000 films)
o Incomplete metadata from streaming video platform
MarcEdit task lists and vendor-supplied metadata: revisiting the subscriber-publisher relationship at the University of Leeds
4. BACKGROUND: CHALLENGES
• Metadata received in the form of an 8-column spreadsheet:
o Internal BFI ID, Title, Access start date, Access end date, Country of origin, Year of release, Genre 1, Genre 2
• Director names added in February 2022
• Basic Alma import profile and MarcEdit Delimited Text Translator to create basic records
• Manually filled out remaining fields (Cast, Runtime, Language, Accessibility, Summary, etc.) by copying and pasting from the BFI Player website
5. RELATIONSHIP
• Acquisitions colleagues about to renegotiate 2022-2023 subscription terms
• BFI agreed to a subscription discount on condition of Leeds providing monthly record files
• We requested all metadata held by the BFI Player streaming video on demand platform
6. VENDOR SPREADSHEET, FEB 2023
7. VENDOR SPREADSHEET, MAY 2023
8. IMPORT TEMPLATE
9. TASK LIST 1
10. TASK LIST 1
11. TASK LIST 2
12. TASK LIST 2
13. TASK LIST 2
14. TASK LIST 2
15. TASK LIST 2 AND TASK LIST 1
16. DEVELOPMENTS AND POTENTIAL
• Addition of Edited column to streamline extension ID workflow
• Addition of LoC URIs to authority entries
• Fine-tuning handover
• Investigating Free package records offer
17. IMPACT
• Streamlined our processes: a fraction of the staffing resource
• Huge increase in knowledge around Regular Expressions, MarcEdit, and streaming video cataloguing standards
• Value offer to other subscribing institutions at no additional cost to them
• Discount on subscription for our University
• Enhanced accessibility cataloguing which improves user experience
18. IMPACT
"Our partnership with the University of Leeds has helped us to deliver a much-requested resource by our BFI Player subscribing institutions. I have come to learn how crucial MARC records are in aiding discoverability, which is of the utmost importance to us, as our aim is for students and staff to use their BFI Player subscriptions to engage with the cultural value of film and support their studies. We didn't have the expertise to create them in-house, and the insight of [the UoL team] has been beyond valuable."
Simone Pyne
Senior Business Development Manager, BFI
simone.pyne@bfi.org.uk
19. QUESTIONS?
Ceilan Hunter-Green
Metadata and Discovery Coordinator
University of Leeds
c.hunter-green@leeds.ac.uk
Editor's Notes
Good morning. My name is Ceilan Hunter-Green and I’m one of the Metadata and Discovery Coordinators at the University of Leeds Libraries.
I’m going to be talking today about MarcEdit task lists and vendor-supplied metadata: revisiting the subscriber-publisher relationship at the University of Leeds. Specifically I’ll be going over a project that delivered a new process for creating MARC records for the British Film Institute’s subscription package of streaming videos.
So let me jump straight in to some quick background on this project. Like all other academic libraries, we needed to hugely increase our electronic provision after the first Covid lockdown in March 2020. Most of the pressure was on providing e-textbooks and e-books, but there were also many modules that suddenly needed access to streaming video. At our library the purchasing is handled by a separate Acquisitions and Reading Lists team, who were absolute heroes at identifying and negotiating with new suppliers, and my Metadata and Discovery team were responsible for handling the access and the discoverability of these resources once we had them.
At first we were handling streaming videos one by one from suppliers like Kanopy, until September of 2021 when we started subscribing to BFI’s institutional offer. For that first year we handled the records like any other subscriber and it wasn’t until the following August that we started looking into the options for providing these records to other subscribers. At the end of May we sent out our first batch of records to reflect all films that would be active on the platform as of early June. And now we’re over a year on from that initial negotiation and have been providing the records for four months.
So why did this vendor/subscriber relationship develop with the British Film Institute in particular?
We knew we wanted access to the BFI streaming video collection as it’s strong in areas that other suppliers aren’t, and their films are particularly needed on Japanese and history of film modules where foundational material is difficult to come by otherwise. There were some challenges, though, as their institutional subscription was new at the time and the streaming video service’s metadata format was very different to what we were used to as librarians. They also had a comparatively small collection which is updated very frequently compared to other more static collections. Most significantly to us, they didn’t offer MARC records, which meant that we spent a lot of time on maintaining our local collection of their films. But that became an opportunity for us to offer something back to them.
For the first year of our subscription we, like all the other subscribing institutions, received metadata for the streaming video collection’s films in the form of a very basic spreadsheet, originally 10 columns and upgraded to 12 columns in February 2022 with the addition of director names.
We used MarcEdit’s Delimited Text Translator to create bare-bones records, and then, as an Ex Libris library running Alma as our LMS, we used our import profiles to run certain normalization rules on those basic records to get them into our system in semi-decent MARC shape. But the remaining metadata that we needed wasn’t provided, so it was copied from the BFI Player streaming platform field by field into Alma’s metadata editor with each monthly update. And yes, this was as time-consuming as it sounds.
So, for about a year we were creating records by copying data mostly manually, in order to have an accurate view of the subscription contents and to facilitate reading list linking, which is helped by having a record for each title in a collection. BFI had asked about the process of us providing the records but it didn’t feel practical when the process was so time-consuming.
But as we went into our second year of subscription, with more familiarity with the metadata from BFI Player, more experience with MarcEdit and more confidence in the value and usefulness of our records to the community, we thought we would be able to take another look at the process of record creation.
To cut a long story and a lot of hard work short, our Acquisitions and Reading Lists colleagues were successful in negotiating a discount. So now we had to create the records in a much more efficient way than we had been, we needed much more metadata to start with, and the records had to be of an even higher standard if they were going to be shared with other institutions.
So just to illustrate the condition of the records we’d been creating by hand, this is the spreadsheet we were getting for the first year of our subscription. We had the title, an ID number, the start and end dates in two different formats, the country of origin, release year, two Genre terms and two Director names, though usually the films just had one director.
Again, all other metadata was copied by hand, one by one, for each record in a monthly update. The average monthly addition was 22 films, but in practice this meant some months had five additions and some months had fifty, so it was an unpredictable demand on our staffing resource.
Apologies for the small text! This is the spreadsheet that BFI now send us, ever since we started providing their records. It’s 56 columns wide and is a huge improvement on that previous spreadsheet. It’s now got all cast members in separate columns, has a unique system identifier from BFI, has runtime, rating, everything we need which previously had to be copied from the BFI Player site in order to fill out our records. It’s also got some additional data like original language titles in addition to English titles, which isn’t on the public BFI Player site, and has a direct URL instead of the general landing page URL we added before so reading lists now link straight to the film rather than to BFI Player where you would need to perform your search again.
The challenge was how to translate a 56-column spreadsheet into a MARC record.
So first can I ask for a show of hands if you use MARC? I think that’s many of us even if you also use archival standards at your library. And keep your hands up if you use MarcEdit?
Building on Claire’s great lightning talk about creating records for archival collections, we start with the MarcEdit import template, which brings that mammoth spreadsheet into MarcEdit through the Delimited Text Translator. That’s mostly a one-to-one import except a couple of places where data from a single column would be added to two fields—like the Cast columns are added both to a concatenated 511 and to individual 700 fields, and the year of release is added both to the 501 and the 046. Some fields are populated as placeholders as you can see in this record, like the 501 for the year of release will be moved to a 500 later, and the director is added to a 701 until we can add a relator term, then they’re moved to a 700 as well. Most of the data as you can see is pretty blunt, just years and dates and yesses and nos. So then we run the task lists.
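For anyone who wants to picture what that one-column-to-several-fields mapping looks like as code, here is a minimal Python sketch of the same idea. The column headings and the exact field choices below are illustrative assumptions; the real mapping lives in MarcEdit's Delimited Text Translator configuration, not in a script.

```python
import csv
import io

# Hypothetical one-row spreadsheet; these column headings are examples,
# not the BFI's actual headings.
SAMPLE = "Title,Year of release,Director 1\nA Film,1968,Jane Doe\n"

def row_to_fields(row):
    """Map one spreadsheet row to a dict of MARC-style fields.
    A couple of columns feed two fields each: the release year goes to
    both a 501 placeholder (later moved to a 500) and the 046, and the
    director goes to a 701 placeholder until a relator term is added."""
    return {
        "245": row["Title"],
        "501": row["Year of release"],
        "046": row["Year of release"],
        "701": row["Director 1"],
    }

records = [row_to_fields(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
```

The point is simply that the import step is mostly one-to-one, with a few columns duplicated into placeholder fields for the task lists to tidy up later.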
So for anyone who doesn’t use MarcEdit much, its Task List functions let you join together lists of tasks that you always want it to run the same way. Instead of having to create the tasks from scratch and type out the replacements you want it to make every time, the program will remember the tasks and the order and you can run them all at once. It’s incredibly useful.
For our BFI data we run two task lists. This first set of tasks adds the Library of Congress fields for Short or Feature film depending on the run time in the 300. It’s a bit silly that it has to be a separate task list, but when I combined it with the second list, the second list started to delete the 856 field from every 13th record 🥴 so I decided to let MarcEdit win, stopped fighting it, and kept the lists separate.
This task list has a simple effect but uses a kind of logical puzzle to get there.
First it copies the text of the 300$a into a new 655 if the number in the 300 is over 40, which is the Library of Congress run time threshold for short films.
Then it replaces the text of any 655 with Feature Film, because it’s the only 655 in the record at this point.
It then adds a new 655 field for Short films to all of the records, then deletes that Short film field if a Feature films field is also present in the record. So it goes around the houses but gets there in the end.
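Expressed as plain Python rather than MarcEdit find-and-replace tasks, the four steps above look something like this. This is a sketch only, with records reduced to tag-to-values dicts; MarcEdit itself expresses the same logic as a sequence of conditional replacements.

```python
import re

def add_length_genre(record):
    """Replicate, in plain Python, the roundabout four-step task list
    that assigns the LC genre term from the 40-minute threshold."""
    runtime = int(re.search(r"(\d+)\s*mins", record["300"][0]).group(1))
    # 1. If over the 40-minute threshold, copy the 300$a into a new 655
    if runtime > 40:
        record.setdefault("655", []).append(record["300"][0])
        # 2. Replace the text of the (only) 655 with the LC genre term
        record["655"] = ["Feature films"]
    # 3. Add a Short films 655 to every record
    record.setdefault("655", []).append("Short films")
    # 4. Delete Short films if Feature films is also present
    if "Feature films" in record["655"]:
        record["655"].remove("Short films")
    return record
```

Run on a record whose 300 reads "1 online resource (136 mins)", only the Feature films term survives; on a 12-minute record, only Short films.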
The second task list does something similar for Fiction and Nonfiction films based on the presence of the genre term ‘Documentary.’
Here’s what our record looks like after running that first task list. The run time in the 300 field is 136 minutes, so this record needs a Feature Films field. The rest of the record is the same as before, only that new 655 has been added.
The second, more robust task list is 179 tasks long and handles all of the rest of the transformation from what is essentially a spreadsheet in MarcEdit form into proper records. It does a few different types of tasks, from really basic ones to more nimble ones that cross-reference multiple fields. I won’t get into every single one, but I am doing a poster session and am really happy to share the full list or go into detail if you’re curious about anything I don’t touch on in the next few slides.
So the basic tasks that I mentioned are the standard additions that every record needs and which aren’t conditional on any other fields and don’t involve altering the order or the content of the field text. So with these tasks we add things like the 006 and 007, the 336, 337 and 338, a 264 for BFI’s distribution, a 506 field to say that access is limited to within the UK, and a 588 field to indicate that we’ve constructed these records from vendor-supplied metadata. That’s all straightforward.
It then does some indelicate, brute-force amendments like the ones shown here. Library of Congress wants the Country of Production to be present in a 257 field and those countries should have standardized names according to the Name Authority File source, some of which are not how BFI provides them, so the task list converts them into the correct format. For example BFI might say the country of origin for a particular film was the USSR, but the NAF name is Soviet Union, so this set of tasks standardizes those in the 257 field. They stay in the original BFI phrasing in the 500 field which we’ll look at in a minute. Luckily the University of Leeds doesn’t have to make any geopolitical decisions here.
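The same brute-force idea can be sketched as a simple lookup table. Only two example pairs appear below, and they are illustrations drawn from the talk rather than the full set of replacements in the task list.

```python
# Illustrative BFI-phrasing to NAF-name pairs; the real list is longer.
NAF_NAMES = {
    "USSR": "Soviet Union",
    "USA": "United States",
}

def normalize_257(country):
    """Return the NAF-standardized name for the 257 field, leaving
    anything unmapped untouched (the original BFI phrasing is kept
    separately for the 500 note)."""
    return NAF_NAMES.get(country, country)
```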
It then runs a similar batch of edits to delete initial articles from the Original Title field, the 130. MarcEdit isn’t smart enough to know what’s an article in the 130 so I added them in English, German, Spanish, French, and Italian, since they’re the bulk of the films in the collection. I also edited the indicators in the 245 for foreign-language titles in those languages, so there’s a margin of error there for other languages.
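A sketch of that article-stripping step, assuming a hand-maintained article list like the one described, and sharing its stated limitation: titles in languages outside the list fall through unchanged.

```python
import re

# Articles in the five languages mentioned; intentionally incomplete.
ARTICLES = ["The", "A", "An", "Der", "Die", "Das", "El", "La", "Le",
            "Les", "Un", "Une", "Il", "Lo"]
_ARTICLE_RE = re.compile(r"^(?:%s)\s+" % "|".join(ARTICLES), re.IGNORECASE)

def strip_initial_article(original_title):
    """Drop a leading article from the original-title (130) text,
    mimicking the brute-force find-and-replace in the task list."""
    return _ARTICLE_RE.sub("", original_title)
```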
There’s another brute-force task to handle the language code in the 008. When we first received the bulk of BFI’s records in May, I went through that spreadsheet and pulled out every language represented in their collection, regardless of whether the film was active on the streaming platform, and added these all into the task list cross-referenced with the MARC language codes. I didn’t have to do anything clever with the 008 positions because my data is really consistent, so instead I could just search for any instance of ||und|| and replace it with the correct language code depending on the text of the 546. So in this top right example, any record in which there’s a 546 for Yoruba will perform a find and replace for every instance of ||und||, and because that only appears in the 008, it doesn’t have to be any smarter than that. This is the opposite of the beautiful machine learning that Alan described in yesterday’s keynote; we luckily already have the language, so all we have to do is translate it into MARC. This would also be possible to do in Excel before importing the data, but this is just how I chose to do it to limit the amount of manual manipulation we have to do with each new spreadsheet.
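The ||und|| find-and-replace can be pictured like this. The language table below has just three example entries where the real task list enumerates every language in the collection.

```python
# Example language-name to MARC code entries (illustrative subset).
MARC_LANG = {"Yoruba": "yor", "French": "fre", "Japanese": "jpn"}

def fix_008_language(record_text, language_546):
    """Replace the undetermined-language placeholder with the MARC code
    for the language named in the 546. Because ||und|| only ever occurs
    in the 008, a plain string replace is safe."""
    code = MARC_LANG.get(language_546)
    if code is None:
        return record_text  # leave ||und|| in place for a manual check
    return record_text.replace("||und||", "||" + code + "||")
```

Leaving unmapped languages untouched mirrors the manual visual check described below for languages that later turn up outside the list.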
Since our first big upload of all active titles at the end of May, we’ve had new films in one or two languages not present in this list. But it’s simple enough to search the file for that ||und|| and visually check it against the 546. Some films really do have no dialogue, so those are fine to leave; we just check there’s no language there, and replace it if need be.
Another task copies this new 008 language code into a 041 language field. The final language task in the screenshot on the bottom right amends the 546 text to add In [blank] with English subtitles, since all of the streaming video collection is accessible in English.
The list also does some other nifty things, sometimes needing regular expressions and sometimes not, to standardize the data provided in the spreadsheet. Earlier I mentioned the country of origin in the 500 field, so this will stay the way the BFI phrases it and a task changes the field from, for example, ‘United States’ to ‘Country of origin: United States’.
The same goes for adding ‘Access ends on’ to the beginning of the 506, which otherwise just has a date; adding ‘1 online resource’ and ‘mins’ to the 300 field; adding ‘BBFC Certificate’ to the rating in the 521 field and changing ‘Not Rated’ to ‘Certification status unknown’; and tidying up the accessibility fields that previously just said Yes or No. If they say No, they’re deleted; depending on the field, if they say Yes they’re amended to closed captioning or audio descriptions in English, and then all moved into the 532 field.
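Sketched in Python, those note-field tidy-ups amount to a handful of small string transformations. The field tags and wording follow what I’ve just described; the function names themselves are mine, purely for illustration:

```python
# Hypothetical sketch of the note-field tidy-ups, not the actual tasks.
def label_500(country: str) -> str:
    """Prefix the bare country value for the 500 note."""
    return "Country of origin: " + country

def label_506(end_date: str) -> str:
    """Prefix the bare access-expiry date for the 506 note."""
    return "Access ends on " + end_date

def rating_521(rating: str) -> str:
    """Label the BBFC rating, handling the Not Rated case."""
    if rating == "Not Rated":
        return "Certification status unknown"
    return "BBFC Certificate: " + rating

def accessibility_532(closed_captions: str, audio_desc: str) -> list:
    """Map the vendor's Yes/No flags onto 532 notes; No flags are dropped."""
    notes = []
    if closed_captions == "Yes":
        notes.append("Closed captioning in English")
    if audio_desc == "Yes":
        notes.append("Audio description in English")
    return notes
```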
Here’s another look at how the task list manoeuvres data out of and into the 008 field, in the screenshot on the top left. We’ve added the runtime from the 300 to the 008 running-time positions (18-20), but the 008 will only know what to do with that if it’s a three-digit runtime. So there are two more tasks underneath that to add initial 0s if the runtime is only one or two digits.
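In Python terms, those two padding tasks are just zero-filling to three digits:

```python
# Equivalent of the runtime-padding tasks: the 008 running-time
# positions expect exactly three digits, so one- and two-digit
# runtimes get leading zeros.
def pad_runtime(minutes: str) -> str:
    return minutes.zfill(3)
```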
There’s another task in the bottom screenshot to add relator terms depending on a genre field. As I said, the 700 is initially only for actors, so the task adds a $e actor to all 700s, and then this task runs to change that subfield to on-screen participant if the genre term in the 653 is Documentary or Short Documentary. The relator term ‘director’ is added to the placeholder 701 field, and those are then all moved to a 700 as well.
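The conditional part of that logic is small enough to sketch directly (genre strings as they appear in the 653; this is an illustration, not the task itself):

```python
# Sketch of the conditional relator logic: every 700 starts as $e actor,
# switched to on-screen participant for documentary genres.
DOCUMENTARY_GENRES = {"Documentary", "Short Documentary"}

def relator_700(genre: str) -> str:
    if genre in DOCUMENTARY_GENRES:
        return "on-screen participant"
    return "actor"
```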
A final task, in the screenshot on the right, reverses the order of the names in all 700s to Last comma First comma relator term. This is not an exact science, as many Chinese and Korean names are already in Last First order in the spreadsheet and the 511, but it takes less time to make those incorrect first and then correct them when we validate the headings. There’s potential here to add another task that only reverses the order of the names if the language code kor or chi is not present in the 008.
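A rough Python equivalent of that reversal, under the same deliberate limitation: names already supplied surname-first come out wrong here and get corrected at the heading-validation stage.

```python
# Sketch of the name-reversal task: "First Last, relator" becomes
# "Last, First, relator". Single-word names pass through unchanged.
def invert_name(field: str) -> str:
    name, _, relator = field.partition(", ")
    parts = name.rsplit(" ", 1)
    if len(parts) == 2:
        name = parts[1] + ", " + parts[0]
    return name + (", " + relator if relator else "")
```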
So just to remind you what our record looked like after importing it using our import profile and running the first task list on it, that’s the example on the left. Many of the fields are still pretty indecipherable.
And after we’ve run the second task list, apologies for the small size, but you can see our record’s effectively doubled. Those accessibility fields are now analysable and also discoverable because they’re present correctly in the 341, 347, 532 and also the 655. The relator terms have been added and the 008 has been updated with runtime, year of distribution, year of release, country of distribution, language, form of item and type of visual material. The 700s have been reversed to Last name First name. And we’ve also added our local information into the 040 for when this record is shared with other institutions.
After this task list process is complete—which takes about ten seconds, in spite of how much talking I’ve just done—we do some spot checking to make sure it all appears as it should, and then we use MarcEdit to validate the 700 and 130 headings and correct most of them since the names and the original film titles are not in the Library of Congress authority format. The validation still takes the longest. We leave the subject headings as we’re given them from BFI, but indicate that they’re local headings. And then we create a much shorter and less detailed delete file which matches on the BFI unique identifier in the 024 to delete films whose access has expired, and we send those two files off to BFI to send to their subscribers.
So all of that was a huge amount of work to get done before we started providing the records to other institutions back at the end of May. But there have also been a few improvements since then as we continue to develop our relationship with BFI.
First, our major pain point was identifying films whose access had been extended. They wouldn’t be obvious in the spreadsheet, since we add new films based on the access start date, but sometimes older films that had already expired would be extended without the start date being edited. So BFI have added an additional column indicating which access dates have been edited in the past month, and this has made the process of identifying those extensions so much more efficient.
Another enhancement we’ve made is to start adding Library of Congress URIs to our validated 700 entries, thanks to MarcEdit again—there’s an option to do this when you’re validating the headings.
At the moment, we receive this spreadsheet of titles in the last week of the month and have a few working days to turn around the files before they’re emailed out to other institutions. Most suppliers host files like these for institutions to download rather than emailing them out, so there’s potential there for streamlining the supply process. We’re also looking for more clarity on how many other subscribing institutions use Ex Libris Alma as their LMS like we do, because there’s potential to use Alma’s Community Zone to share records, but of course that’s only helpful with a critical mass of Alma users.
And finally BFI offer a package of freely available material in addition to the subscription films, and some subscribing institutions have expressed interest in getting records for those films as well. We’re currently in talks with BFI to understand the turnover and demand for those Free titles and may be able to offer this in future.
The biggest impact on my team is that a process that used to take at least a day, up to four days sometimes, is now maximum a few hours, including the time it takes to validate the Library of Congress authority headings.
It’s also been an incredibly useful exercise for us, both in terms of a huge stretch project for our team and my personal understanding of MarcEdit and regular expressions, and also in developing our knowledge of streaming video cataloguing standards.
We’ve been able to offer value to our fellow UK HE institutions, who no longer have to create their own records in the painstaking way that we were. These MARC records are also now provided with no price increase for subscribers, which is great and, I believe, fairly rare. Plus of course the discount for our own subscription is a bonus!
And of course it's had a positive impact for students and other catalogue users now that the records are provided faster and to a higher standard, especially for things like the accessibility fields which are much more user-friendly now.
From the BFI side it’s also been a useful partnership. We’ve gotten some really encouraging feedback from them, including this from Simone Pyne who is BFI’s Senior Business Development Manager.
Simone said “Our partnership with the University of Leeds has helped us to deliver a much-requested resource by our BFI player subscribing institutions. I have come to learn how crucial MARC records are in aiding discoverability, which is of the utmost importance to us, as our aim is for students and staff to use their BFI player subscriptions to engage with the cultural value of film and support their studies. We didn't have the expertise to create these records in-house, and the insight of the UoL team has been beyond valuable.”
So we’re really pleased to be ambassadors for library metadata standards and for MARC records, and of course also thrilled to have the feedback that the relationship is mutually beneficial!
Simone’s also happy to hear from anyone interested in the BFI Player institutional subscription, so please feel free to direct any questions or inquiries about that to her, her email’s simone.pyne@bfi.org.uk.
So there we have it! This project has been an incredibly useful exercise for me and my colleagues, but it’s also opened a door for us in terms of agency within our supplier-vendor relationships, and has provided value to our vendor BFI. We know there are still areas for development to come, but for the time being we’re very happy with the work we’ve done so far.
If there are any questions I’m happy to answer them now, or I’ll be loitering around my poster for the next few breaks and sessions. Please do also feel free to email me with any other questions (or suggestions, especially if you’re one of the institutions who receive these records!)
Thanks so much for your time.
***if no questions, Library Juice courses on MarcEdit and streaming cataloguing taught me reg ex and the new fields we’d need to add***