Dutch Ships and Sailors 
Linked Data Cloud 
Victor de Boer, Matthias van Rossum, Jur Leinenga, Rik Hoekstra 
With input from Andrea Bravo Balado and Robin Ponstein 
Netherlands Institute for Sound and Vision / VU University Amsterdam 
v.de.boer@vu.nl 
ISWC2014
The Problem: 
((Maritime) historical) data is not integrated 
25+ Maritime datasets; Heterogeneous
The solution 
Well, Linked Data obviously!
But why Linked Data 
• Heterogeneous models, one data format 
– Link what can be linked 
– Keep specificity of original data 
– Allow integration at project level (and beyond) 
• Links to other sources: re-use knowledge 
• Extensible 
• Allow multiple levels of semantic enrichment/normalization 
– through Named Graphs 
– Provenance
Dutch Ships and Sailors 
• KB Delpher 
• Dutch-Asiatic Shipping (DAS): Voyages (Huygens ING) 
• “VOC Opvarenden” 
• Mustering and payroll information (DANS Easy)
Modeling in collaboration with historians (1) 
[Diagram: RDF graph of one muster-roll entry. Dataset: Jur Leinenga (Huygens ING), Muster-rolls Northern Provinces 1803-1937] 
• Records 
– mdb:persoonscontract-del_gem-1879-101-16858-Pieter_Hoekstra (dss:Record, mdb:PersoonsContract) 
– mdb:aanmonstering-del_gem-1879-101 (dss:Record, mdb:Aanmonstering) 
• Ship 
– mdb:schip-del_gem-1879-101-Isadora (dss:Schip, mdb:Schip) 
– owl:sameAs mdb:schip-del_gem-1879-137-Isadora 
– mdb:has_KB_article <http://resolver.kb.nl/resolve?urn=ddd:010063756:mpeg21:a0045:ocr> 
• Ship type 
– mdb:schoener (dss:ShipType, mdb:ScheepsType) 
• Person 
– mdb:persoon-del_gem-1879-101-16858 (foaf:Person, dss:Person, mdb:Person) 
– foaf:firstname / mdb:voornaam: "Pieter" 
– foaf:lastname / mdb:achternaam: "Hoekstra" 
• Rank 
– mdb:matroos (dss:Rank, mdb:Rang) 
• Properties shown: dss:ship / mdb:ship, rdfs:label, dss:shipname / mdb:scheepsnaam, dss:shiptype / mdb:scheepstype, dcterms:identifier / mdb:inventarisnummer, dss:has_aanmonstering, mdb:has_person, dss:rank / mdb:rank, mdb:maandgage 
• Literals shown: "1870-1894", "Isadora", "32"
Modeling in collaboration with historians (2) 
[Diagram: RDF graph of one crew count. Dataset: Matthias van Rossum (VU-hist), payroll information for European vs Asiatic sailors (17th / 18th C)] 
• Records 
– gzmvoc:telling-1046-De_Berkel (dss:Record, gzmvoc:Telling) 
– das:voyage-1918_61 (dss:Record, das:Voyage), linked via gzmvoc:heeft DAS heenreis 
• Ship 
– gzmvoc:schip-1046-De_Berkel (dss:Ship, gzmvoc:Schip) 
• Ship type 
– gzmvoc:type-Ship (dss:ShipType, gzmvoc:Scheepstype) 
• Blank node 
– __bnode_, attached via gzmvoc:aziatischeBemanning 
• Properties shown: dss:has_ship / gzmvoc:schip, rdfs:label, dss:scheepsnaam / gzmvoc:scheepsnaam, gzmvoc:scheepstype, dss:has_shiptype / gzmvoc:has_shiptype, dss:azRegistratieKop, gzmvoc:azAantalMatrozen, gzmvoc:telling 
• Literals shown: "1046", "Moorse mattroosen", "De Berkel", "Schip", "21"
Modelling principles 
• Model each dataset as directly as possible 
– Only “syntactical” transformation to RDF 
– No normalization 
• Reusability 
• Transparency, trust 
• Normalize and link in second stage 
– store in separate RDF Named Graphs
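The two-stage principle can be sketched in a few lines of plain Python. This is an illustration only, not the project's conversion code: the namespace URIs, graph names, and the `grootte` column are hypothetical, and quads are represented as simple tuples rather than through an RDF library.

```python
# Hypothetical namespaces and graph names; the real DSS URIs may differ.
MDB = "http://example.org/mdb/"
DSS = "http://example.org/dss/"
RAW_GRAPH = "http://example.org/graph/mdb-raw"          # stage 1 output
NORM_GRAPH = "http://example.org/graph/mdb-normalized"  # stage 2 output


def row_to_quads(row_id, row):
    """Stage 1: purely syntactic conversion, one quad per non-empty cell."""
    subject = f"{MDB}schip-{row_id}"
    return [
        (subject, f"{MDB}{column}", value, RAW_GRAPH)
        for column, value in row.items()
        if value is not None
    ]


def normalize_quads(quads):
    """Stage 2: add cleaned values in a separate graph; stage 1 stays untouched."""
    normalized = []
    for subject, predicate, value, _graph in quads:
        if predicate == f"{MDB}scheepsnaam":
            normalized.append((subject, f"{DSS}shipname", value.strip(), NORM_GRAPH))
    return normalized


# "grootte" is a made-up column name used to show that empty cells are skipped.
row = {"scheepsnaam": "Isadora ", "scheepstype": "schoener", "grootte": None}
raw = row_to_quads("del_gem-1879-101-Isadora", row)
normalized = normalize_quads(raw)
```

Because the cleaned name lives in its own graph, a user who distrusts the normalization can ignore that graph and still query the original values.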
Link properties and classes to interoperability layer 
• mdb:scheepsType rdfs:subPropertyOf dss:has_shipType 
• das:typeOfShip rdfs:subPropertyOf dss:has_shipType 
• Example data: mdb:Schip1 mdb:scheepsType mdb:Kof 
• Example data: das:ShipX das:typeOfShip das:Kofship
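A small sketch of what the interoperability layer buys: one query on the shared property returns data from both datasets. The triples follow one reading of the slide (both dataset properties are sub-properties of dss:has_shipType); the functions are a hand-rolled stand-in for the RDFS reasoning a triple store would provide, not the project's implementation.

```python
RDFS_SUBPROPERTY_OF = "rdfs:subPropertyOf"

# Triples as read from the slide; prefixed names are kept as plain strings.
triples = [
    ("mdb:scheepsType", RDFS_SUBPROPERTY_OF, "dss:has_shipType"),
    ("das:typeOfShip", RDFS_SUBPROPERTY_OF, "dss:has_shipType"),
    ("mdb:Schip1", "mdb:scheepsType", "mdb:Kof"),
    ("das:ShipX", "das:typeOfShip", "das:Kofship"),
]


def subproperties(triples, prop):
    """Return prop plus every property declared (transitively) below it."""
    found = {prop}
    changed = True
    while changed:
        changed = False
        for s, p, o in triples:
            if p == RDFS_SUBPROPERTY_OF and o in found and s not in found:
                found.add(s)
                changed = True
    return found


def query(triples, prop):
    """Return (subject, object) pairs for prop and all of its sub-properties."""
    props = subproperties(triples, prop)
    return sorted((s, o) for s, p, o in triples if p in props)


ship_types = query(triples, "dss:has_shipType")
```

Querying `mdb:scheepsType` directly still returns only the MDB statement, so the dataset-specific view is preserved alongside the integrated one.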
Vocabulary Links 
http://semanticweb.cs.vu.nl/amalgame/ 
[Diagram: mdb:Schip1 mdb:scheepsType mdb:Kof; das:ShipX das:typeOfShip das:Kofship; three skos:exactMatch links; AAT concepts Aat:Platbodems and Aat:Kof] 
• Links to DBpedia (ship types, places, ranks) 
• Links to Getty AAT (ship types, ranks) 
• Links to GeoNames (places)
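As a rough illustration of how vocabulary links connect datasets, the sketch below groups local concepts by the external concept they match. It assumes that mdb:Kof and das:Kofship each have a skos:exactMatch link to Aat:Kof, which is one plausible reading of the diagram rather than something the slide states explicitly.

```python
from collections import defaultdict

# Assumed skos:exactMatch links: (local concept, external AAT concept).
exact_matches = [
    ("mdb:Kof", "Aat:Kof"),
    ("das:Kofship", "Aat:Kof"),
]


def group_by_external_concept(matches):
    """Map each external concept to the sorted local concepts that match it."""
    groups = defaultdict(set)
    for local, external in matches:
        groups[external].add(local)
    return {external: sorted(local_set) for external, local_set in groups.items()}


groups = group_by_external_concept(exact_matches)
```

Local concepts that end up in the same group can be treated as equivalent for cross-dataset queries without either dataset having to change its own terms.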
Identifying ships 
Date        ShipName    ShipType  ShipSize  HomePort  CurrentPort          Captain 
1852-02-27  Alberdiena  kof       NULL      NULL      Noorwegen (N)        Wolkammer Albert Augustinus 
1852-07-31  Alberdina   kof       NULL      Farmsum   Friedrichstadt (D)   Wolkammer Albert A. 
1861-09-30  Alberdina   kof       98        NULL      Gdansk, Danzig (PL)  Wolkammer Albert Augustinus 
1870-03-08  Alberdina   brik      222       NULL      NULL                 Wolkammer Albert Augustinus 
1875-09-22  Alberdina   bark      309       NULL      Oostzee              Wolkammer Augustinus 
• Identify ships within a dataset using machine learning techniques 
– Based on: name, size, type, destinations, etc. 
– Background knowledge 
• 33,435 owl:sameAs links 
– Robin Ponstein
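To make the matching problem concrete, here is a hand-weighted similarity score over two of the features the slide names (ship name and type) plus the captain's name. It is a simplified stand-in, not the machine learning approach used in the project: the weights are arbitrary assumptions, and a learned model would also use size, destinations, and background knowledge.

```python
from difflib import SequenceMatcher


def text_similarity(a, b):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(rec_a, rec_b):
    """Weighted score; the weights (0.5, 0.3, 0.2) are illustrative guesses."""
    score = 0.5 * text_similarity(rec_a["ship_name"], rec_b["ship_name"])
    score += 0.3 * text_similarity(rec_a["captain"], rec_b["captain"])
    if rec_a["ship_type"] == rec_b["ship_type"]:
        score += 0.2
    return score


# Three rows taken from the table on this slide.
rec_1852_02 = {"ship_name": "Alberdiena", "ship_type": "kof",
               "captain": "Wolkammer Albert Augustinus"}
rec_1852_07 = {"ship_name": "Alberdina", "ship_type": "kof",
               "captain": "Wolkammer Albert A."}
rec_1875_09 = {"ship_name": "Alberdina", "ship_type": "bark",
               "captain": "Wolkammer Augustinus"}

score_same_type = match_score(rec_1852_02, rec_1852_07)
score_other_type = match_score(rec_1852_02, rec_1875_09)
```

With these weights the two 1852 records score higher than the 1852/1875 pair, because the spelling variant in the name costs little while the change of ship type removes the type bonus.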
Linking to Historical newspapers 
• Use ML to detect links between ships and historical newspaper articles (delpher.nl) 
– Features: ship name, time intervals, captains' names, ship type, named entities, keywords, background knowledge 
• 179,120 links 
– Andrea Bravo Balado
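The listed features could be computed along these lines before being handed to a classifier. This is a sketch under stated assumptions: the record layout, the active interval, and the article date are hypothetical, and simple substring tests stand in for proper named-entity and keyword extraction.

```python
from datetime import date


def link_features(ship, article_text, article_date):
    """Compute simple boolean features for one (ship, article) candidate pair."""
    text = article_text.lower()
    return {
        "name_mentioned": ship["name"].lower() in text,
        "captain_mentioned": ship["captain"].lower() in text,
        "type_mentioned": ship["type"].lower() in text,
        "date_in_interval": ship["active_from"] <= article_date <= ship["active_to"],
    }


ship = {
    "name": "Transit",
    "captain": "Schaap",
    "type": "schoonerschip",
    # Hypothetical interval; the slides give no dates for this ship.
    "active_from": date(1870, 1, 1),
    "active_to": date(1880, 12, 31),
}
article = ("het hier behoorende schoonerschip Transit, kapitein Schaap, "
           "in de Noordzee is gezonken")

features = link_features(ship, article, date(1875, 10, 24))      # hypothetical date
features_late = link_features(ship, article, date(1890, 10, 24))  # outside interval
```

A short name such as "Transit" will also match unrelated articles, which is why the name feature is combined with the date interval and the captain's name rather than used alone.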
Example 
(Newspaper text from the Delpher OCR, translated from Dutch) 
[HARLINGEN, 24 October.] The stranded Swedish ship, which we reported on in our previous issue, has been pulled off by the tugboat from here and brought into port. On that occasion we were informed that four more vessels had stranded on Terschelling. Word has also been received that the schooner Transit of this port, Captain Schaap, has sunk in the North Sea after its stern was smashed away; an ordinary seaman lost his life. In addition, three foreign ships have put in here with more or less heavy damage. 
Spoiler alert! It sank in the North Sea.
Provenance (PROV-O) 
• Individual named graphs have provenance information 
– Who made it (people/software?) 
– Based on what source 
– Content confidence 
• Matches historical science requirements
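A provenance description for one named graph might look like the sketch below. `prov:Entity`, `prov:wasAttributedTo`, and `prov:wasDerivedFrom` are PROV-O terms; the graph, agent, and source identifiers are hypothetical, and `ex:confidence` is a made-up property, since the slides do not say how content confidence is recorded.

```python
def provenance_triples(graph_uri, agent, source, confidence):
    """Describe who made a named graph, from what source, and how reliable it is."""
    return [
        (graph_uri, "rdf:type", "prov:Entity"),
        (graph_uri, "prov:wasAttributedTo", agent),    # who made it
        (graph_uri, "prov:wasDerivedFrom", source),    # based on what source
        (graph_uri, "ex:confidence", f'"{confidence}"'),  # hypothetical property
    ]


def to_lines(triples):
    """Render triples as simple Turtle-like statements."""
    return [f"{s} {p} {o} ." for s, p, o in triples]


# All identifiers below are illustrative, not actual DSS URIs.
lines = to_lines(provenance_triples(
    "ex:graph/ship-links",
    "ex:agent/linking-script",
    "ex:graph/mdb-raw",
    0.9,
))
```

Attaching these statements to the graph rather than to individual triples keeps the overhead low while still letting a historian trace every derived link back to its source and its maker.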
ClioPatria Triplestore 
• Data live at Huygens Institute for Dutch History 
– http://dutchshipsandsailors.nl/data 
– ~30 million triples 
• Dev. server 
– http://semanticweb.cs.vu.nl/dss 
• Purl.org URIs redirect to live server with content negotiation 
• SPARQL endpoint 
• Web interface
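Querying such an endpoint from a script could look roughly like this. The endpoint path is an assumption (the slides give only the base URL), and the query is a generic example; the request is built but deliberately not sent, so the sketch runs without network access.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint path; only the base URL appears on the slide.
ENDPOINT = "http://dutchshipsandsailors.nl/data/sparql/"

QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?resource ?label WHERE { ?resource rdfs:label ?label } LIMIT 10
"""


def build_sparql_request(endpoint, query):
    """Build a GET request that asks for SPARQL results as JSON."""
    url = endpoint + "?" + urlencode({"query": query.strip()})
    # The Accept header is how content negotiation selects the result format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
# To execute it, pass the request to urllib.request.urlopen(request).
```

The same Accept-header mechanism is what lets one URI serve an HTML page to a browser and RDF to a client that asks for it.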
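[Figure: the Dutch Ships and Sailors data cloud] 
• Datasets: DAS, GZMVOC, MDB, VOCOPV Begunstigden, VOCOPV Soldijboeken, VOCOPV Opvarenden 
• External vocabularies: PROV, AAT, foaf 
• Link types: dss:hasKBLink, owl:sameAs, rdfs:subClassOf, rdfs:subPropertyOf, dss:DAS link, skos:exactMatch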
Data analysis and visualisation
Current work: linking original scans
Take home 
• Linked Data principles are a great fit for digital history requirements 
– Heterogeneous models/datasets, light-weight reusable integration 
– Multiple levels of normalisation, through separate named graphs 
– Semantic Web provenance matches historical provenance 
• Watch out when you sail your schooner into the North Sea
DataLab 
http://dutchshipsandsailors.nl/data 
v.de.boer@vu.nl

Presentation Dutch Ships and Sailors at ISWC2014


Editor's Notes

  • #6 Muster-roll database 1803-1937: Muster rolls are crew lists giving the name, rank, wage, place of residence, and age of every sailor on board, as well as the name, type, and size of the ship. […] for Groningen and Friesland they only begin in the nineteenth century. They give us a glimpse of the working life of sailors in the nineteenth and early twentieth centuries. Matthias van Rossum studied the relations between European and Asian sailors under the Dutch East India Company (VOC, 1602-1795) and found them to be largely equal. That is in sharp contrast with the later nineteenth-century situation, when Asian sailors worked in an unequal and sometimes less free position, with worse treatment and pay. Working under the VOC was, moreover, characterized by a pragmatic multiculturalism.