The Provenance and History of the Manuscripts formerly in the Phillipps Collection
Toby Burrows
Department of Digital Humanities
The Phillipps manuscript collection
• Phillipps’ own printed catalogue
(1837-1871) goes up to no. 23,837
• Thomas Fitzroy Fenwick (grandson,
d. 1938) spent fifty years
reorganizing and renumbering: up to
no. 38,628
• Fenwick’s estimate of the total was
close to 60,000 volumes and
individual documents
• Phillipps also owned 50,000 books,
as well as many prints, photographs,
drawings and paintings
Sir Thomas Phillipps (1792-1872)
Assembling the collection
[Diagram: major acquisitions feeding into the Phillipps collection]
• Van Ess, 1823-6
• Meerman, 1824
• Page-Turner, 1824
• Celotti, 1825
• Drury, 1827
• Lang, 1828
• Craven Ord, 1829-32
• Guilford (North), 1830
• Heber, 1836
• Libri, 1859-62
Re-creation of Phillipps’ shelves, Grolier Club
Dispersal of the collection
Fenwick family (1886-1945):
• Sales to interested libraries and governments (Germany, Belgium,
Netherlands, France, Ireland, Wales) – more than 2,500 items
• Auctions at Sotheby’s, 1886 to 1938 – 22 auctions, more than 22,000
lots, raised £97,000 (over £30 million in today’s money)
• Residue (12,000 items) sold to the Robinson brothers in 1945 for
£100,000 (£11-12 million in today’s money)
W.H. Robinson Ltd (1945-1958):
• Series of sale catalogues, 1945-1954
• Donation to the Bodleian Library of the remaining materials, 1958
Sotheby’s (1946-1950, 1965-1977):
• Series of sale catalogues
Data sources
• Schoenberg Database of Manuscripts
– Format: Relational database
– Comments: Incorporates other sources, esp. sales catalogues; 6,000 Phillipps MSS; 20,000 Phillipps events
• Library catalogues (BL, KB etc.)
– Format: Relational databases; generally MARC records
– Comments: Provenance in notes; export can be awkward
• Union catalogues
– Format: Relational databases; printed bibliographies
– Comments: Formats vary; coverage varies; export can be awkward
• Sale catalogues
– Format: Printed books (some digitized); online sources (PDFs, Web sites)
– Comments: Many included in Schoenberg; MSS in ABE, eBay etc.
• Phillipps catalogues and lists
– Format: Printed book; partly digitized; supplemented by handwritten notes
– Comments: Partly included in Schoenberg; handwritten notes not digitized
• Phillipps provenance indexes (BL, IRHT)
– Format: Handwritten; not digitized
– Comments: Arranged by Phillipps number; no longer updated
• Annotated sales catalogues & printed catalogues
– Format: Handwritten; not digitized
– Comments: Researchers (Munby), owners (Phillipps), auctioneers (Sotheby’s); held in Cambridge UL, Bodleian, BL
Project summary
• Two main research questions:
– The history and significant characteristics of the transmission
of a major group of European manuscripts between collections
and collectors over the centuries (provenance)
– The applicability and value of Linked Data technologies as a
methodology for the large-scale analysis of the history of
cultural objects and collections (“network archaeology”)
• Project plan
– Ingest the data; transform them to a common Data Model;
represent them computationally; analyse and visualize them;
make them available to other researchers
• Tools
– Excel, OpenRefine, Neo4j, Nodegoat, visualization tools
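The ingest and transform steps of the project plan can be sketched roughly as follows. This is only an illustration: the sample rows, the column names (`manuscript_id`, `transaction_date`, `seller`, `buyer`, `provenance`) and the shape of the output record are assumptions made for the example, not the actual export format of the Schoenberg Database or the project's Data Model.

```python
import csv
import io

# Hypothetical stand-in for a CSV export of sale transactions.
# Column names and the second row are invented for illustration.
SAMPLE_CSV = """manuscript_id,transaction_date,seller,buyer,provenance
Phillipps MS 16402,1862,Sotheby's,Sir Thomas Phillipps,Guglielmo Libri|Sir Thomas Phillipps
Example MS 1,1901,Example Dealer,Example Library,Example Collector
"""


def phillipps_rows(csv_text):
    """Return the rows whose provenance field mentions Phillipps."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if "phillipps" in row["provenance"].lower()]


def to_event(row):
    """Map one CSV row onto a simple agent/object/recipient event record."""
    return {
        "relationship": "SOLD",
        "agent": row["seller"],
        "object": row["manuscript_id"],
        "recipient": row["buyer"],
        "date": row["transaction_date"],
    }


events = [to_event(row) for row in phillipps_rows(SAMPLE_CSV)]
```

In practice the filtering and clean-up might equally be done in Excel or OpenRefine; the point here is only the shape of the transformation from tabular rows to event records.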
In 1862, Sir Thomas Phillipps bought Phillipps MS 16402 in London
as part of the Sotheby’s sale of the collection of Guglielmo Libri.
[Diagram: the entities Libri, Phillipps, Sotheby’s and MS 16402, linked by their actions, with the properties London and 1862]
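One way to express this event statement as nodes and relationships is sketched below. The class names and the choice of verbs (CONSIGNED, SOLD, ACQUIRED) are assumptions for illustration, as is attaching the place to each relationship as a `place` property; the slide itself only names the entities, the date and the place.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    label: str  # e.g. "AGENT" or "OBJECT"
    name: str


@dataclass
class Relationship:
    source: Node
    verb: str
    target: Node
    properties: dict = field(default_factory=dict)


libri = Node("AGENT", "Guglielmo Libri")
phillipps = Node("AGENT", "Sir Thomas Phillipps")
sothebys = Node("AGENT", "Sotheby's")
manuscript = Node("OBJECT", "Phillipps MS 16402")

# Date and place describe the action, so they sit on the relationships.
when_where = {"date": 1862, "place": "London"}
event = [
    Relationship(libri, "CONSIGNED", manuscript, dict(when_where)),
    Relationship(sothebys, "SOLD", manuscript, dict(when_where)),
    Relationship(phillipps, "ACQUIRED", manuscript, dict(when_where)),
]


def describe(rel):
    """Render a relationship in an arrow notation for quick inspection."""
    props = ", ".join(f"{key}={value}" for key, value in rel.properties.items())
    return f"{rel.source.name} -[{rel.verb} {{{props}}}]-> {rel.target.name}"


for rel in event:
    print(describe(rel))
```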
Neo4j: graph database
• Nodes and relationships (each
with properties)
• Various tools for data import
• Cypher query language for
creating nodes, relationships
and properties
• Cypher is also used to run
queries, analyse paths, count
and list
• No schema as such – develop
and define as you go
• Own visualization interface, but
also works with others
• Data export – JSON
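The kind of pattern matching and path analysis that a graph query performs can be illustrated in plain Python. This is a sketch of the idea only, not Neo4j or Cypher; the triples and the identifiers `MS-A` and `MS-B` are placeholders invented for the example.

```python
from collections import defaultdict, deque

# Toy triples: (agent, relationship, object). All data here is invented.
EDGES = [
    ("Phillipps", "OWNS", "MS-A"),
    ("Chester Beatty", "OWNS", "MS-A"),
    ("Sotheby's", "SOLD", "MS-A"),
    ("Phillipps", "OWNS", "MS-B"),
    ("Sotheby's", "SOLD", "MS-B"),
]


def objects_matching(edges, pattern):
    """Return objects that have every (agent, relationship) pair in pattern."""
    seen = defaultdict(set)
    for agent, rel, obj in edges:
        seen[obj].add((agent, rel))
    return sorted(obj for obj, pairs in seen.items() if set(pattern) <= pairs)


def shortest_path(edges, start, goal):
    """Breadth-first search over the undirected agent/object graph."""
    neighbours = defaultdict(set)
    for agent, _rel, obj in edges:
        neighbours[agent].add(obj)
        neighbours[obj].add(agent)
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(neighbours[path[-1]]):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


shared = objects_matching(
    EDGES, [("Phillipps", "OWNS"), ("Chester Beatty", "OWNS")]
)
chain = shortest_path(EDGES, "Chester Beatty", "MS-B")
```

Here `shared` lists the objects linked to both collectors, and `chain` is one shortest chain of agents and objects connecting the two endpoints.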
Neo4j Data Model – nodes (entities)
• AGENT
– Type: Person, Organization
– Properties: name
• OBJECT
– Type: Manuscript
– Properties: id, title, foliation, layout, binding, illustration
• WORK
– Type: Text, Description, Exhibition
– Properties: title, incipit, language
• PUBLICATION
– Type: Catalogue, Book, Article
– Properties: title
Neo4j Data Model – relationships
• GAVE, SOLD, CONSIGNED, OWNS, ACQUIRED
– Properties: date, id, certitude, price
• PRODUCED
– Properties: date, certitude
• CONTAINS
– Properties: locus
• SAME_AS
– Properties: certitude
• COMPOSED, TRANSLATED, COMPILED
– Properties: date, certitude
• ANNOTATED, INSCRIBED
– Properties: date, locus, certitude
• DESCRIBED_IN
– Properties: date, item no.
• DESCRIBED_AS
– Properties: date, item no.
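Because Neo4j has no schema as such, a check like the following could be run outside the database to catch properties that do not belong to a relationship type. This is a minimal sketch: the dictionary is transcribed from this slide, the spelling `item_no` for "item no." is an assumption, and the function name is hypothetical.

```python
# Property names listed for each relationship type on this slide.
OWNERSHIP = {"date", "id", "certitude", "price"}
ALLOWED_PROPERTIES = {
    **{rel: OWNERSHIP for rel in ("GAVE", "SOLD", "CONSIGNED", "OWNS", "ACQUIRED")},
    "PRODUCED": {"date", "certitude"},
    "CONTAINS": {"locus"},
    "SAME_AS": {"certitude"},
    **{rel: {"date", "certitude"} for rel in ("COMPOSED", "TRANSLATED", "COMPILED")},
    **{rel: {"date", "locus", "certitude"} for rel in ("ANNOTATED", "INSCRIBED")},
    "DESCRIBED_IN": {"date", "item_no"},
    "DESCRIBED_AS": {"date", "item_no"},
}


def unexpected_properties(relationship, properties):
    """Return, sorted, the property names not listed for this relationship type."""
    return sorted(set(properties) - ALLOWED_PROPERTIES[relationship])
```

For example, `unexpected_properties("CONTAINS", {"locus": "ff. 1-20", "price": 1})` flags `price`, since only `locus` is listed for CONTAINS.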
Neo4j Data Model – relationship statements
Node Relationship Node
AGENT: Person GAVE OBJECT: Manuscript
AGENT: Organization SOLD OBJECT: Manuscript
OBJECT: Manuscript CONTAINS WORK: Text
AGENT: Person COMPOSED WORK: Text
PUBLICATION: Catalogue CONTAINS WORK: Description
OBJECT: Manuscript DESCRIBED_AS WORK: Description
AGENT: Organization PRODUCED PUBLICATION: Catalogue
WORK: Exhibition DESCRIBED_IN PUBLICATION: Catalogue
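An ownership statement of this kind could be turned into a Cypher statement along the following lines. This is a sketch under assumptions: it uses AGENT and OBJECT as node labels with `name` and `id` as identifying properties, leaves out the Person/Organization and Manuscript types, and the function and parameter names are hypothetical. Cypher does not allow a relationship type to be passed as a query parameter, so the type is checked against a fixed list before being placed in the query text.

```python
ALLOWED_RELATIONSHIPS = {"GAVE", "SOLD", "CONSIGNED", "OWNS", "ACQUIRED"}


def ownership_statement(relationship):
    """Build a parameterised Cypher statement for AGENT -[relationship]-> OBJECT."""
    if relationship not in ALLOWED_RELATIONSHIPS:
        raise ValueError(f"unknown relationship type: {relationship}")
    return (
        "MERGE (a:AGENT {name: $agent_name}) "
        "MERGE (o:OBJECT {id: $object_id}) "
        f"CREATE (a)-[:{relationship} {{date: $date}}]->(o)"
    )


query = ownership_statement("SOLD")
# Values are supplied separately as parameters rather than pasted into the text.
parameters = {
    "agent_name": "Sotheby's",
    "object_id": "Phillipps MS 16402",
    "date": 1862,
}
```

Keeping the values in `parameters` avoids quoting problems with names such as Sotheby's; how the query and parameters are sent to the database depends on the client being used.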
DATA MODEL – Nodegoat
• PERSON
– Sub-objects: Nationality (country)
– Related to: Manuscript, Text, Catalogue
• ORGANIZATION
– Sub-objects: Location (city; country)
– Related to: Manuscript, Text, Catalogue
• MANUSCRIPT
– Sub-objects: Sold, Donated, Owned, Described In, Produced, Contents
– Related to: Person/Organization (Agent, Owner, Buyer, Donor, Recipient, Scribe, Artist, Producer); Location (city; country); Catalogue; Text
• TEXT
– Related to: Person (Author); Manuscript
• CATALOGUE
– Related to: Organization (Publisher); Person (Compiler); Manuscript
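The object and sub-object structure can be pictured with a small sketch in which a manuscript carries a list of event-like sub-objects, each linking to agents by role. This illustrates the shape of the model only; it is not Nodegoat's own interface, and the class names, field names and sample values are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SubObject:
    kind: str  # e.g. "Produced", "Sold", "Donated", "Owned"
    year: int
    roles: dict = field(default_factory=dict)  # role name -> person/organization
    location: str = ""


@dataclass
class Manuscript:
    name: str
    sub_objects: list = field(default_factory=list)

    def timeline(self):
        """Return the sub-objects in chronological order."""
        return sorted(self.sub_objects, key=lambda sub: sub.year)

    def agents_in_role(self, role):
        """List, in chronological order, the agents who played a given role."""
        return [sub.roles[role] for sub in self.timeline() if role in sub.roles]


# Placeholder data: names, years and places are not taken from any real record.
example = Manuscript(
    "Example MS",
    [
        SubObject("Sold", 1900, {"Seller": "Dealer B", "Buyer": "Library C"}, "City Y"),
        SubObject("Produced", 1500, {"Scribe": "Scribe A"}, "City X"),
        SubObject("Sold", 1820, {"Seller": "Owner A", "Buyer": "Dealer B"}, "City Y"),
    ],
)
```

With this structure, `example.timeline()` gives the manuscript's history in date order, and `example.agents_in_role("Buyer")` gives the successive buyers.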
Current status of the project
• Data imported: selections from Schoenberg data, other sample
data
• Data Model: theoretical work + working versions
• Demo versions of Neo4j and Nodegoat databases
• Tested and documented queries, analyses and visualizations
• To come:
– Adding much more data in a production environment
(Nodegoat)
– Carrying out more extensive visualizations and analyses
• Across the whole collection
• In relation to specific “use cases”
– Exporting data for reuse by other researchers
Dr Toby Burrows
Marie Curie Fellow
Department of Digital Humanities
King’s College London
26-29 Drury Lane
London WC2B 5RL
toby.burrows@kcl.ac.uk
@tobyburrows
tobyburrows.wordpress.com

Icms 2015 burrows


Editor's Notes

  • #2 I’m going to talk about the manuscript collection of Sir Thomas Phillipps, one of the great 19th-century collectors. I’ll be reporting on a European Union project aimed at reconstructing the Phillipps collection, and on the ways in which I’m using new technologies to achieve this.
  • #3 The size of the collection – almost certainly the biggest private collection ever assembled; bigger than most (all?) public collections. Phillipps was not just a collector of manuscripts.
  • #4 Phillipps was buying at a good time – many private libraries came on the market, especially in the 1820s and 1830s. This was a period when prices for manuscripts rose quite sharply in Britain – Phillipps played a significant part in this price rise.
  • #5 Phillipps was the illegitimate son and sole heir of a wealthy Manchester mill owner. After filling his stately home at Middle Hill (Worcestershire) with his collection, he moved it all to Thirlestaine House in Cheltenham in 1864. This gives some idea of the profusion of the collection. Grolier Club, New York – these are actual documents from the Phillipps Collection.
  • #6 The collection was inherited by one of Phillipps’ daughters and her husband. Most of the dispersal was managed by Phillipps’ grandson, Thomas Fitzroy Fenwick (died 1938). The Robinson brothers’ sales were followed by a series of sales by other antiquarian dealers, especially Sotheby’s, through to the mid-1970s. Documents are still advertised for sale on sites like ABE Books today.
  • #7 Today, there is a wide variety of sources of information about the Phillipps manuscripts. They were produced for different purposes, in very varied formats – some in digital form, others not digitized. There is no comprehensive list of Phillipps manuscripts, and no consolidated source of information about their history.
  • #8 The Schoenberg Database of Manuscripts (SDM) includes almost 20,000 transactions relating to Phillipps manuscripts – about 6,000 of the manuscripts are covered. Assembling the data for my project – I started with a CSV export from the Schoenberg Database, filtered for Phillipps transactions.
  • #9 Other data sources are more difficult to make use of. This is a page from a list of some of the Phillipps MSS, made for probate purposes – note the alterations and revisions, and the short descriptions. There are two different handwritten versions of this list, with slightly different coverage. This one is in the Grolier Club Library in New York (the other is in the Bodleian Library).
  • #10 Two aspects of the project. Firstly, the history of the Phillipps Collection: study of the provenance of the manuscripts on a large scale, looking particularly at the patterns of relationships between the manuscripts and the people and organizations involved in their history – both individually and collectively. Secondly, the use of digital tools and data modelling methodologies to represent these patterns, and to serve as a basis for visualization and analysis.
  • #15 Here are some sample relationship statements expressed in Neo4j notation
  • #16 This shows Phillipps MSS now owned by Columbia University, the Morgan Library and some other US libraries, together with their donors: George Plimpton, William S. Glazier and others Data from the Schoenberg Database
  • #17 And this shows some manuscripts which were once owned by both Phillipps and Chester Beatty, and sold by Sotheby’s I’ve expanded the network for Phillipps 12283, so it also shows the works contained in this manuscript, and their authors, as well as a catalogue description for it (Data from the Schoenberg Database)
  • #18 A few screen shots from my small Neo4j graph database using Neo4j’s own visualization interface This shows the Phillipps MSS now owned by the Royal Library in The Hague, with their Phillipps numbers and the titles of the works they contain (data from the KB catalogue)
  • #19 But I ran into significant limitations with Neo4j – both in the way it handles data modelling, and in its capacity for visualization and analysis So I’ve also been testing an alternative approach using software called Nodegoat Developed in the Netherlands at the University of Amsterdam
  • #20 Nodegoat uses a structure based on types of objects, which can each have sub-objects The sub-objects can serve as event clusters, as you can see from my data model “Manuscript” is the central object; its sub-objects are mostly event types The sub-objects can include links to related objects – especially people and organizations, who play different roles depending on the type of event
  • #21 I currently have a test data collection in Nodegoat, involving 100 Phillipps manuscripts and about 250 provenance transactions (data from the Schoenberg Database) Here is an example of a manuscript object – its description is the top half Its associated sub-objects are summarized in the lower half This MS has three “Sold” sub-objects, as well as “Owned”, “Produced” and “Contents” (a link to the text it contains) Produced in 1580 in the UK, then sold three times between 1815 and 1967, owned by Yale University in 2010
  • #22 Nodegoat has interesting visualization interfaces – geographical, social networks, and chronological This is the geographical visualization interface, showing how manuscripts have moved over the centuries, both around Europe and to the United States This is for the whole of the sample dataset (200 MSS) – you can also limit the visualization to specific manuscripts or groups of manuscripts
  • #23 A closer look at this map - you can see the cities where manuscripts were produced (in purple) The lines reflect movements due to sales (in blue) or donations (in orange) or other changes in ownership (red)
  • #24 You can also use the time slider to see how the pattern of movement changed over time This is the picture up to 1937 – only a couple of MSS in the dataset had moved to the United States before that date
  • #25 Here is the chronological visualization – showing the relative numbers of sales since 1750 (in blue/purple)
  • #26 And finally, the social visualization showing the network of connections in the sample dataset of 100 Phillipps manuscripts Shows the centrality of Sir Thomas Phillipps (big red circle), Sotheby’s (green) and France (white) as a place of origin Also a time slider to view the changes in the network over time
  • #27 Can zoom in to inspect each node and each relationship Here’s a magnification of part of the graph The three major nodes are Sotheby’s (green: 54 sales transactions), Phillipps (red: 47 sales transactions) and France (white: where 32 of the manuscripts were originally produced)
  • #28 Conclude with a summary of the project to date My aim is to ingest as much data as possible during the project, and to make the platform available for addition of further data in the future by other researchers Effectively building an information system about the Phillipps manuscripts, while also developing generic models for the provenance of cultural heritage objects