The Carnegie Museum of Art is working to structure provenance data so that curators, scholars, and software developers can create visualizations that answer questions that would be difficult or impossible to answer without computer assistance. Provenance, the written description of the history of ownership and custody of an artwork, is typically recorded as a list of the periods, places, and owners in that history. It captures the current best understanding of this history succinctly and precisely, and it makes visible the gaps and uncertainties that remain. Most institutions write provenance as semi-structured text in an institution-defined format. A structured, computer-readable format for this data would enable search, visualization, and aggregated research.
The American Alliance of Museums’ suggested standard, though widely used across museums, is not defined with enough specificity to allow automated extraction of the structured data contained within provenance texts. In addition, the provenance record model in collection management systems (CMS) is often not designed for structured data, or provides no way to verify that the provenance text matches the structured data. A comprehensive text-based provenance standard, paired with a software library that can parse records written to this standard and convert them into structured data, would let existing workflows remain in place while structured data is extracted automatically from provenance records. The records could continue to be stored in existing CMS databases yet contain machine-readable data for use in research and visualization. Beyond the data itself, the stories these objects hold are often moving and sometimes astonishing. The ability to ask previously impossible questions and receive previously inaccessible answers, across one museum’s collection and (eventually) across many museums’ collections, is a resource that art historians and scholars will find extremely valuable.
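To make the idea of pairing a text standard with a parsing library concrete, here is a minimal sketch in Python. It is not the museum's standard or library: the record grammar (periods separated by semicolons, fields by commas, an optional leading "Possibly", an optional trailing four-digit year), the `Period` type, and the `parse_provenance` function are all assumptions invented for illustration, and the sample record uses placeholder names rather than real provenance.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Period:
    """One period of ownership (hypothetical structure, for illustration)."""
    owner: str
    place: Optional[str]
    year: Optional[int]
    uncertain: bool


_YEAR = re.compile(r"^\d{4}$")


def parse_provenance(text: str) -> List[Period]:
    """Parse a record in an assumed, simplified format:
    'Owner, Place, Year; Owner, Place, Year.'
    A leading 'Possibly' marks the period as uncertain.
    """
    periods: List[Period] = []
    # Drop the final full stop, then split into periods on semicolons.
    for chunk in text.strip().rstrip(".").split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        uncertain = chunk.lower().startswith("possibly ")
        if uncertain:
            chunk = chunk[len("possibly "):]
        parts = [p.strip() for p in chunk.split(",")]
        # Treat a trailing four-digit field as the year, if present.
        year = None
        if parts and _YEAR.match(parts[-1]):
            year = int(parts.pop())
        owner = parts[0] if parts else ""
        place = ", ".join(parts[1:]) or None
        periods.append(Period(owner, place, year, uncertain))
    return periods


# Placeholder record; the names are invented, not real provenance.
record = "Possibly Owner A, City X, 1850; Owner B, City Y, 1872; Museum C, City Z."
periods = parse_provenance(record)
```

Because the text stays human-readable, a record like this could still live in an ordinary CMS text field, while the parsed `Period` objects feed search or visualization. A real standard would need to handle far more, such as date ranges, transfer methods, and footnoted sources.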