The document discusses metadata remediation procedures and workflow. It describes completed projects that cleaned up metadata for over 41,000 digital objects across inactive collections. The remediation process extracts raw metadata, analyzes the data, applies best practices, prepares worksheets, assigns persistent identifiers, cleans up and enhances field values, and updates documentation. Remediation supports an upcoming digital asset management system migration, since clean data is easier and more consistent to migrate. Excel is used as the remediation tool for its advanced features, ease of use, and ability to handle large datasets.
3. Updates
Completed | Next
● Inactive digital collections
○ NV Test Site
○ Menus
○ Showgirls
○ Boomtown
3 projects
6 remediation cycles
41,937 objects
4. How | Why?
How do we remediate? Why is remediation important?
● Supports the DAMS migration project
● High priority for Phase 1 of the migration project
● Clean data is easier to migrate
● Clean data is consistent
● Data is uniform across all collections
Tool: Excel
● Advanced features
○ Formulas
○ Functions
○ VBA code
● Advantages
○ Easy to use
○ Easy to share
○ Easy to learn
○ Powerful for large sets of data
Editor's Notes
Let me introduce you to the metadata remediation workflow, share some project updates, and explain why metadata remediation is one of the high-priority projects in Digital Collections this year.
Darnelle is taking the lead on this project and I am honored to work with him.
This brief talk introduces the metadata remediation workflow and updates from my perspective.
Extract raw metadata - exporting the original metadata from CONTENTdm as a text file and importing it into an Excel spreadsheet
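To make this step concrete, here is a minimal Python sketch of the same export-and-import idea. It is purely illustrative, since the actual workflow loads the text file into Excel; the file names, the tab-delimited format, and the encoding are assumptions.

# Illustrative sketch only (the actual procedure uses Excel directly):
# load a tab-delimited CONTENTdm export and save it as a spreadsheet.
# The file names and the utf-8 encoding are assumptions.
import pandas as pd

raw = pd.read_csv("export.txt", sep="\t", dtype=str, encoding="utf-8")
print(raw.shape)  # number of records and number of metadata fields
raw.to_excel("remediation_worksheet.xlsx", index=False)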
Analyze data - find abnormalities or patterns; think of ways to incorporate best practices from past projects, or ways to adapt those practices to accommodate the new project's peculiarities
Best practices include
(1) check that the data imported correctly - no errors in any cells
(2) check for duplicates prior to data cleanup (see the sketch after this list)
(3) look for inconsistencies across the fields
(4) review the old metadata profile fields and decide how to map values to the new profile so that all legacy metadata is captured
(5) version control
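For illustration only, a short Python sketch of how checks (2) and (3) might look outside Excel (the real checks are done with Excel formulas and functions); the field names Identifier and Type are assumptions, not the collections' actual fields.

# Illustrative sketch of checks (2) and (3) above; "Identifier" and "Type"
# are assumed field names, not the actual metadata profile.
import pandas as pd

raw = pd.read_excel("remediation_worksheet.xlsx", dtype=str)

# (2) duplicates: flag rows that share the same identifier value
dupes = raw[raw.duplicated(subset=["Identifier"], keep=False)]
print(f"{len(dupes)} rows share an identifier with another row")

# (3) inconsistencies: list the distinct values used in a controlled field
print(raw["Type"].str.strip().value_counts())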
Prepare worksheets - creating Excel spreadsheets for remediating complex fields such as subject, name authorities, and spatial fields.
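As an illustrative sketch (not the actual procedure), pulling a multi-valued field out into its own worksheet could look like this in Python; the Subject column name and the semicolon delimiter are assumptions.

# Illustrative sketch: build a separate worksheet for a complex field,
# assuming subject terms sit in a "Subject" column separated by semicolons.
import pandas as pd

raw = pd.read_excel("remediation_worksheet.xlsx", dtype=str)

subjects = (
    raw["Subject"]
    .str.split(";")       # one list of terms per record
    .explode()            # one term per row
    .str.strip()
    .dropna()
    .drop_duplicates()
    .sort_values()
)
subjects.to_excel("subject_worksheet.xlsx", index=False)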
REMEDIATION process includes assigning ARKs (persistent identifiers) to all digital objects at the parent level, cleaning up messy data to improve consistency across all collections, and enhancing certain fields by adding metadata or by extracting metadata and moving it to a more appropriate field (for example, moving dates from the description to the date field to enable faceting)
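For example, the date enhancement could be sketched in Python as below; this is illustrative only, and the Description and Date column names and the four-digit-year pattern are assumptions rather than the project's actual rules.

# Illustrative sketch of the "move dates out of description" enhancement;
# column names and the year pattern are assumptions.
import pandas as pd

raw = pd.read_excel("remediation_worksheet.xlsx", dtype=str)

# pull the first four-digit year (1800-2099) that appears in the description
found = raw["Description"].str.extract(r"\b((?:18|19|20)\d{2})\b")[0]

# fill the Date field only where it is currently empty
raw["Date"] = raw["Date"].fillna(found)
raw.to_excel("remediation_worksheet.xlsx", index=False)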
Updating documentation includes updating all shared spreadsheets and documents with what was completed, future considerations and projects, and recommendations, and keeping the other Migration Workgroup members informed about what has been done.
Meanwhile, the remediation process doesn’t happen on its own - it is a collaboration that stretches beyond Digital Collections to WADS and Technical Services. I’d like to emphasize that it involves a great deal of searching for metadata and file discrepancies in the Vault, as well as inaccuracies in the metadata and the finding aids, and correcting those as part of a bigger effort to migrate clean data into the new DAMS.
During the testing and learning phase of the remediation work, I noticed that each collection’s data is unique, which makes it hard to draw conclusions and create a standard remediation procedure.
Darnelle trained me and showed me several ways to accomplish the same result, but often I had to go beyond that and adapt the procedures, learning a new method or accommodating a collection with very peculiar, unstandardized data.
The good thing is that after much trial and error, and several remediations of the same data sets, some patterns started to emerge, and I am able to continue the remediation work with a more structured approach that I can apply to particular fields. Due to the unique content of the collections, I can’t apply the same strategies at the collection level, but rather at the metadata field level. So I can say I have developed best practices for my own workflow that make my work more efficient.