ICMS 2015, Burrows
1. The Provenance and History
of the Manuscripts formerly
in the Phillipps Collection
Department of Digital Humanities
Toby Burrows
2. The Phillipps manuscript collection
• Phillipps’ own printed catalogue
(1837-1871) goes up to no. 23,837
• Thomas Fitzroy Fenwick (grandson,
d. 1938) spent fifty years
reorganizing and renumbering: up to
no. 38,628
• Fenwick’s estimate of the total was
close to 60,000 volumes and
individual documents
• Phillipps also owned 50,000 books,
as well as many prints, photographs,
drawings and paintings
Sir Thomas Phillipps (1792-1872)
5. Dispersal of the collection
Fenwick family (1886-1945):
• Sales to interested libraries and governments (Germany, Belgium,
Netherlands, France, Ireland, Wales) – more than 2,500 items
• Auctions at Sotheby’s, 1886 to 1938 – 22 auctions, more than 22,000
lots, raised £97,000 (over £30 million)
• Residue (12,000 items) sold to the Robinson brothers in 1945 for
£100,000 (£11-12 million)
W.H. Robinson Ltd (1945-1958):
• Series of sale catalogues, 1945-1954
• Donation to the Bodleian Library of the remaining materials, 1958
Sotheby’s (1946-1950, 1965-1977):
• Series of sale catalogues
6. Data sources
Source | Format | Comments
Schoenberg Database of Manuscripts | Relational database | Incorporates other sources, esp. sales catalogues; 6,000 Phillipps MSS; 20,000 Phillipps events
Library catalogues (BL, KB etc.) | Relational databases; generally MARC records | Provenance in notes; export can be awkward
Union catalogues | Relational databases; printed bibliographies | Formats vary; coverage varies; export can be awkward
Sale catalogues | Printed books (some digitized); online sources (PDFs, Web sites) | Many included in Schoenberg; MSS in ABE, eBay etc.
Phillipps catalogues and lists | Printed book, partly digitized; supplemented by handwritten notes | Partly included in Schoenberg; handwritten notes not digitized
Phillipps provenance indexes (BL, IRHT) | Handwritten; not digitized | Arranged by Phillipps number; no longer updated
Annotated sales catalogues & printed catalogues | Handwritten; not digitized | Researchers (Munby), owners (Phillipps), auctioneers (Sotheby’s); held in Cambridge UL, Bodleian, BL
9. Project summary
• Two main research questions:
– The history and significant characteristics of the transmission
of a major group of European manuscripts between collections
and collectors over the centuries (provenance)
– The applicability and value of Linked Data technologies as a
methodology for the large-scale analysis of the history of
cultural objects and collections (“network archaeology”)
• Project plan
– Ingest the data; transform them to a common Data Model;
represent them computationally; analyse and visualize them;
make them available to other researchers
• Tools
– Excel, OpenRefine, Neo4j, Nodegoat, visualization tools
10. In 1862, Sir Thomas Phillipps bought Phillipps MS 16402 in London
as part of the Sotheby’s sale of the collection of Guglielmo Libri.
[Diagram: the entities Phillipps, Libri, Sotheby’s and MS 16402 linked by their roles in the event, with place (London) and date (1862)]
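The event statement above can be sketched as a tiny property graph, with nouns as nodes, verbs as relationships, and place and date attached to the relationships. This is only an illustrative sketch: the class names (`Node`, `Relationship`), the helper `describe`, and the choice of `CONSIGNED`, `SOLD` and `ACQUIRED` for the three roles are assumptions made here, not the project's actual encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    label: str  # entity type, e.g. "AGENT" or "OBJECT"
    name: str


@dataclass(frozen=True)
class Relationship:
    source: Node
    rel_type: str
    target: Node
    properties: tuple = ()  # (key, value) pairs describing the action


# Entities named in the event statement
phillipps = Node("AGENT", "Sir Thomas Phillipps")
libri = Node("AGENT", "Guglielmo Libri")
sothebys = Node("AGENT", "Sotheby's")
ms = Node("OBJECT", "Phillipps MS 16402")

# Place and date belong to the action, not to the entities
event = (("date", 1862), ("place", "London"))

# Assumed reading of the roles: Libri consigns, Sotheby's sells, Phillipps acquires
relationships = [
    Relationship(libri, "CONSIGNED", ms, event),
    Relationship(sothebys, "SOLD", ms, event),
    Relationship(phillipps, "ACQUIRED", ms, event),
]


def describe(rel):
    """Render one relationship in an arrow notation similar to a graph pattern."""
    props = ", ".join(f"{key}={value}" for key, value in rel.properties)
    return f"({rel.source.name})-[{rel.rel_type} {{{props}}}]->({rel.target.name})"
```

Calling `describe` on each item in `relationships` gives one line per role in the event.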
11. Neo4j: graph database
• Nodes and relationships (each
with properties)
• Various tools for data import
• Cypher query language for
creating nodes, relationships
and properties
• Cypher is also used to run
queries, analyse paths, count
and list
• No schema as such – develop
and define as you go
• Own visualization interface, but
also works with others
• Data export – JSON
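As a rough illustration of how Cypher creates nodes, relationships and properties, the sketch below builds a parameterised Cypher statement as a Python string. The function name `merge_event_cypher` and the parameter names (`$agent`, `$ms_id`, `$date`) are hypothetical; the labels and properties follow the data model in this talk. Nothing here connects to a database.

```python
def merge_event_cypher(rel_type):
    """Return a Cypher statement linking an agent to a manuscript.

    The relationship type is written into the query text, so it is
    checked first to make sure it is a plain upper-case identifier.
    """
    if not (rel_type.isidentifier() and rel_type.isupper()):
        raise ValueError(f"unexpected relationship type: {rel_type!r}")
    return (
        "MERGE (a:AGENT {name: $agent})\n"
        "MERGE (m:OBJECT {id: $ms_id})\n"
        f"MERGE (a)-[r:{rel_type} {{date: $date}}]->(m)\n"
        "RETURN a, r, m"
    )


# Parameter values for the 1862 example; these would be supplied
# alongside the query text when it is run against a database.
params = {
    "agent": "Sir Thomas Phillipps",
    "ms_id": "Phillipps MS 16402",
    "date": 1862,
}
query = merge_event_cypher("ACQUIRED")
```

`MERGE` is used rather than `CREATE` so that running the statement twice does not duplicate the agent or the manuscript.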
12. Neo4j Data Model – nodes (entities)
Node (entity: label) | Type | Properties
AGENT | Person; Organization | name
OBJECT | Manuscript | id; title; foliation; layout; binding; illustration
WORK | Text; Description; Exhibition | title; incipit; language
PUBLICATION | Catalogue; Book; Article | title
13. Neo4j Data Model – relationships
Relationship | Properties
GAVE, SOLD, CONSIGNED, OWNS, ACQUIRED | date; id; certitude; price
PRODUCED | date; certitude
CONTAINS | locus
SAME_AS | certitude
COMPOSED, TRANSLATED, COMPILED | date; certitude
ANNOTATED, INSCRIBED | date; locus; certitude
DESCRIBED_IN | date; item no.
DESCRIBED_AS | date; item no.
14. Neo4j Data Model – relationship statements
Node | Relationship | Node
AGENT: Person | GAVE | OBJECT: Manuscript
AGENT: Organization | SOLD | OBJECT: Manuscript
OBJECT: Manuscript | CONTAINS | WORK: Text
AGENT: Person | COMPOSED | WORK: Text
PUBLICATION: Catalogue | CONTAINS | WORK: Description
OBJECT: Manuscript | DESCRIBED_AS | WORK: Description
AGENT: Organization | PRODUCED | PUBLICATION: Catalogue
WORK: Exhibition | DESCRIBED_IN | PUBLICATION: Catalogue
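Because Neo4j has no schema as such, a convention like the statements above has to be enforced outside the database. One possible approach, sketched here under that assumption, is to list the permitted (label, relationship, label) triples and check new statements against them. The names `ALLOWED` and `is_allowed` are hypothetical, and the set contains only the sample statements from this slide, not the full model.

```python
# Permitted (source label, relationship, target label) triples,
# copied from the sample relationship statements on this slide.
ALLOWED = {
    ("AGENT", "GAVE", "OBJECT"),
    ("AGENT", "SOLD", "OBJECT"),
    ("OBJECT", "CONTAINS", "WORK"),
    ("AGENT", "COMPOSED", "WORK"),
    ("PUBLICATION", "CONTAINS", "WORK"),
    ("OBJECT", "DESCRIBED_AS", "WORK"),
    ("AGENT", "PRODUCED", "PUBLICATION"),
    ("WORK", "DESCRIBED_IN", "PUBLICATION"),
}


def is_allowed(source_label, rel_type, target_label):
    """Check one statement against the list of permitted triples."""
    return (source_label, rel_type, target_label) in ALLOWED


# A sale must run from an agent to a manuscript, not the other way round
ok = is_allowed("AGENT", "SOLD", "OBJECT")
reversed_sale = is_allowed("OBJECT", "SOLD", "AGENT")
```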
19. DATA MODEL – Nodegoat
Object | Sub-objects | Related to
PERSON | Nationality (country) | Manuscript; Text; Catalogue
ORGANIZATION | Location (city; country) | Manuscript; Text; Catalogue
MANUSCRIPT | Sold; Donated; Owned; Described In; Produced; Contents | Person/Organization (Agent, Owner, Buyer, Donor, Recipient, Scribe, Artist, Producer); Location (city; country); Catalogue; Text
TEXT | | Person: Author; Manuscript
CATALOGUE | | Organization: Publisher; Person: Compiler; Manuscript
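The object and sub-object structure can be pictured as a manuscript record carrying a list of dated events. The sketch below is not Nodegoat's API; the classes `Manuscript` and `SubObject` and the method `timeline` are hypothetical. The sample values follow an example manuscript described in the speaker's notes (produced in 1580 in the UK, sold between 1815 and 1967, owned by Yale University in 2010); only the first and last sale are included, on the assumption that those two years are the sale dates, because the date of the middle sale is not given.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SubObject:
    kind: str                    # event type, e.g. "Sold", "Owned", "Produced"
    year: int
    agent: Optional[str] = None  # related person or organization, if known
    place: Optional[str] = None  # related location, if known


@dataclass
class Manuscript:
    name: str
    sub_objects: List[SubObject] = field(default_factory=list)

    def timeline(self):
        """Return (year, event type) pairs in chronological order."""
        ordered = sorted(self.sub_objects, key=lambda sub: sub.year)
        return [(sub.year, sub.kind) for sub in ordered]


# Events entered in arbitrary order; timeline() sorts them by year
ms = Manuscript(
    "Example manuscript",
    [
        SubObject("Owned", 2010, agent="Yale University"),
        SubObject("Sold", 1967),
        SubObject("Produced", 1580, place="UK"),
        SubObject("Sold", 1815),
    ],
)
history = ms.timeline()
```

Keeping the events as sub-objects of the manuscript means each one can carry its own date, place and related agents.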
27. Current status of the project
• Data imported: selections from Schoenberg data, other sample
data
• Data Model: theoretical work + working versions
• Demo versions of Neo4j and Nodegoat databases
• Tested and documented queries, analyses and visualizations
• To come:
– Adding much more data in a production environment
(Nodegoat)
– Carrying out more extensive visualizations and analyses
• Across the whole collection
• In relation to specific “use cases”
– Exporting data for reuse by other researchers
28. Dr Toby Burrows
Marie Curie Fellow
Department of Digital Humanities
King’s College London
26-29 Drury Lane
London WC2B 5RL
toby.burrows@kcl.ac.uk
@tobyburrows
tobyburrows.wordpress.com
Editor's Notes
I’m going to talk about the manuscript collection of Sir Thomas Phillipps, one of the great 19th-century collectors
I’ll be reporting on a European Union project aimed at reconstructing the Phillipps collection, and on the ways in which I’m using new technologies to achieve this
The size of the collection – almost certainly the biggest private collection ever assembled; bigger than most (all?) public collections
Phillipps was not just a collector of manuscripts
Phillipps was buying at a good time – many private libraries came on the market, especially in the 1820s and 1830s
A period when prices for manuscripts rose quite sharply in Britain – Phillipps played a significant part in this price rise
Phillipps was the illegitimate son and sole heir of a wealthy Manchester mill owner
After filling his stately home at Middle Hill (Worcestershire) with his collection, he moved it all to Thirlestaine House in Cheltenham in 1864
This gives some idea of the profusion of the collection
Grolier Club, New York – these are actual documents from the Phillipps Collection
The collection was inherited by one of Phillipps’ daughters and her husband
Most of the dispersal was managed by Phillipps’ grandson Thomas Fitzroy Fenwick (died in 1938)
Robinson brothers’ sales were followed by a series of sales by other antiquarian dealers, especially Sotheby’s, through to the mid-1970s
Documents are still advertised for sale on sites like ABE Books today
Today, a wide variety of sources of information about the Phillipps manuscripts
Produced for different purposes, in very varied formats – some in digital form, others not digitized
No comprehensive list of Phillipps manuscripts; no consolidated source of information about their history
SDM includes almost 20,000 transactions relating to Phillipps manuscripts – about 6,000 of the manuscripts are covered
Assembling the data for my project – started with a CSV export from the Schoenberg Database, filtered for Phillipps transactions
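A filtering step of this kind might look like the sketch below. The column names (`transaction_id`, `provenance`) and the two sample rows are made up for illustration and do not reflect the actual layout of a Schoenberg Database export; the function name `phillipps_rows` is likewise hypothetical.

```python
import csv
import io

# Made-up sample standing in for a CSV export; real column names will differ
SAMPLE = """transaction_id,seller,buyer,provenance
1,Sotheby's,Sir Thomas Phillipps,Guglielmo Libri|Sir Thomas Phillipps
2,Example Dealer,Example Buyer,Example Owner
"""


def phillipps_rows(handle, column="provenance"):
    """Keep only the rows whose provenance field mentions Phillipps."""
    reader = csv.DictReader(handle)
    return [row for row in reader if "phillipps" in row[column].lower()]


rows = phillipps_rows(io.StringIO(SAMPLE))
kept_ids = [row["transaction_id"] for row in rows]
```

With a real export, an opened file object would take the place of `io.StringIO(SAMPLE)`.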
Other data sources are more difficult to make use of
This is a page from a list of some of the Phillipps MSS, made for probate purposes – note the alterations and revisions, and the short descriptions
There are two different hand-written versions of this list, with slightly different coverage
This one is in the Grolier Club Library in New York (the other is in the Bodleian Library)
Two aspects of the project – firstly, the history of the Phillipps Collection: study of the provenance of the manuscripts on a large scale
Looking particularly at the patterns of relationships between the manuscripts and the people and organizations involved in their history – both individually and collectively
Secondly, the use of digital tools and data modelling methodologies to represent these patterns, and to serve as a basis for visualization and analysis
Look at data modelling first
There are a variety of ways of representing provenance in a computational setting – none is entirely satisfactory
I went back to a basic conceptual model
Here is a typical provenance event statement
Here is a simple conceptual model of this event showing entities (nouns in blue) linked by their actions or roles (verbs in red) + properties or attributes of these actions (orange) – places and dates when they occurred
I then looked for software which managed data in a way which was similar to that kind of conceptual model
A graph database like Neo4j can show provenance events as a series of nodes and relationships which correspond to the nouns and verbs in that model
Neo4j enables path analysis and pattern matching – not just quantitatively, but also by looking for specific chains of relationships
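To illustrate what looking for a specific chain of relationships means, the sketch below searches a small edge list for manuscripts linked to two given agents. It is plain Python rather than Cypher, the function name `shared_manuscripts` is hypothetical, and the edges are a hand-written sample based on manuscripts mentioned in this talk, not project data.

```python
from collections import defaultdict

# (agent, relationship, manuscript) triples: an illustrative sample only
EDGES = [
    ("Sir Thomas Phillipps", "OWNS", "Phillipps MS 12283"),
    ("Chester Beatty", "OWNS", "Phillipps MS 12283"),
    ("Sotheby's", "SOLD", "Phillipps MS 12283"),
    ("Sir Thomas Phillipps", "ACQUIRED", "Phillipps MS 16402"),
]


def shared_manuscripts(edges, agent_a, agent_b, rel_types=("OWNS", "ACQUIRED")):
    """Return manuscripts linked to both agents by an ownership-type relationship."""
    holders = defaultdict(set)
    for agent, rel_type, manuscript in edges:
        if rel_type in rel_types:
            holders[manuscript].add(agent)
    return sorted(
        manuscript
        for manuscript, agents in holders.items()
        if {agent_a, agent_b} <= agents
    )


both = shared_manuscripts(EDGES, "Sir Thomas Phillipps", "Chester Beatty")
```

In a graph database the same question is expressed as a pattern over nodes and relationships instead of a loop over an edge list.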
I then had to try and develop a detailed Data Model using the Neo4j notation
These are the basic entities (nodes/labels) + types + some key properties
“Object” is primarily a physical entity: the manuscript volume itself
“Work” is a conceptual entity – the text carried by a manuscript, or the description of a manuscript contained in a catalogue (like FRBR’s Work)
These are the basic relationships (verbs) + some key properties
The properties relate to the action, not to the entities involved in the action
Not just ownership-related transactions – also want to include description-related transactions
Here are some sample relationship statements expressed in Neo4j notation
This shows Phillipps MSS now owned by Columbia University, the Morgan Library and some other US libraries, together with their donors: George Plimpton, William S. Glazier and others
Data from the Schoenberg Database
And this shows some manuscripts which were once owned by both Phillipps and Chester Beatty, and sold by Sotheby’s
I’ve expanded the network for Phillipps 12283, so it also shows the works contained in this manuscript, and their authors, as well as a catalogue description for it
(Data from the Schoenberg Database)
A few screen shots from my small Neo4j graph database using Neo4j’s own visualization interface
This shows the Phillipps MSS now owned by the Royal Library in The Hague, with their Phillipps numbers and the titles of the works they contain
(data from the KB catalogue)
But I ran into significant limitations with Neo4j – both in the way it handles data modelling, and in its capacity for visualization and analysis
So I’ve also been testing an alternative approach using software called Nodegoat
Developed in the Netherlands at the University of Amsterdam
Nodegoat uses a structure based on types of objects, which can each have sub-objects
The sub-objects can serve as event clusters, as you can see from my data model
“Manuscript” is the central object; its sub-objects are mostly event types
The sub-objects can include links to related objects – especially people and organizations, who play different roles depending on the type of event
I currently have a test data collection in Nodegoat, involving 100 Phillipps manuscripts and about 250 provenance transactions (data from the Schoenberg Database)
Here is an example of a manuscript object – its description is the top half
Its associated sub-objects are summarized in the lower half
This MS has three “Sold” sub-objects, as well as “Owned”, “Produced” and “Contents” (a link to the text it contains)
Produced in 1580 in the UK, then sold three times between 1815 and 1967, owned by Yale University in 2010
Nodegoat has interesting visualization interfaces – geographical, social networks, and chronological
This is the geographical visualization interface, showing how manuscripts have moved over the centuries, both around Europe and to the United States
This is for the whole of the sample dataset (100 MSS) – you can also limit the visualization to specific manuscripts or groups of manuscripts
A closer look at this map – you can see the cities where manuscripts were produced (in purple)
The lines reflect movements due to sales (in blue) or donations (in orange) or other changes in ownership (red)
You can also use the time slider to see how the pattern of movement changed over time
This is the picture up to 1937 – only a couple of MSS in the dataset had moved to the United States before that date
Here is the chronological visualization – showing the relative numbers of sales since 1750 (in blue/purple)
And finally, the social visualization showing the network of connections in the sample dataset of 100 Phillipps manuscripts
Shows the centrality of Sir Thomas Phillipps (big red circle), Sotheby’s (green) and France (white) as a place of origin
Also a time slider to view the changes in the network over time
Can zoom in to inspect each node and each relationship
Here’s a magnification of part of the graph
The three major nodes are Sotheby’s (green: 54 sales transactions), Phillipps (red: 47 sales transactions) and France (white: where 32 of the manuscripts were originally produced)
Conclude with a summary of the project to date
My aim is to ingest as much data as possible during the project, and to make the platform available for addition of further data in the future by other researchers
Effectively building an information system about the Phillipps manuscripts, while also developing generic models for the provenance of cultural heritage objects