Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Victor de Boer
Linked Open Data
(for Archives)
Studiemiddag Linked Data Archieven
CC-by-nc-nd https://www.flickr.com/photos/joinash/
The ‘problem’
(archival) data is not integrated
• Data is “lost” or published without
reusability
– In different physical ...
Linked Data
• Machine readable format
• Standardized
• Flexibility to connect heterogenous data
– Link what can be linked
...
What is Linked Open Data?
Open Data
is about licenses to allow reuse
Linked Data
is about technology for interoperability
★
Available on the web (whatever
format), but with an open license
★★
Available as machine-readable
structured data (e.g. ...
URIs
Use Web URIs for things you want to talk about.
http://rijksmuseum.nl/data/painting1
My application can go there usin...
RDF Triples form Graphs
rijks:Painting001
geo:Haarlem
rijks:Frans_Hals
147590
52.38084, 4.63683
geo:Noord-Holland
geo:Neth...
http://lod-cloud.net/
Some Examples
Dutch Ships and Sailors
KB NEWSPAPERS
Dutch-Asiatic Shipping“VOC Opvarenden”
Jur Leinenga
Matthias van Rossum
Elbing voyag...
HETEROGENEOUS but LINKED DATAMODELS
dss:Record
gzmvoc:Telling
gzmvoc:telling-1046-De_Berkel
__bnode_1
gzmvoc:aziatischeBem...
Metadata and original scans
hasOriginalScan
Reuse: Links to other web resources
Historical Newspapers
http://delpher.nl
isReferencedBy
[HARLINGEN, 24 October.]
…gestr...
mdb:Schip1 mdb:Kof
mdb:scheepsType
das:ShipX das:Kofship
das:typeOfShip
Aat:Kof
Aat:Platbodems
skos: broader
Reuse: backgr...
Data analysis and visualisation
The interlinked knowledge graph makes it possible to
efficiently investigate integrated re...
Mediapark Hilversum
Ontstaan in 1997:
3 archieven (RVD, AVAC/Omroepen,
Stichting Film en Wetenschap)
1 museum (Omroepmuseu...
NISV’s task
1.Safeguard the wealth of
Dutch AV material
Archive,
knowledge centre
2. Access to this wealth
Archival collec...
“The Archive as a Laboratory”
Openbeelden.nl / openimages.eu
Open source (MMBase, FFmpeg, LAMP)
Open media formats (Ogg Theora, WebM)
Open standards (Du...
∼800,000h
∼110h
Reuse: 3rd party applications
http://www.beeldvoorbeeld.nl/tv/
...TO CONTEXT: MUTUALLY CONNECTED
COLLECTIONS...
20-9-2016
Connecting collections:
topics, people, genres, etc
Catalogue
P...
External: Networked heritage
Links through vocabularies (such as GTAA)
Concept: Jan Sluijters (schilder)
DBpedia
Related items
Links
• Styles (Expressionism,
Cubism, Fauvism)
• Period (contempo...
Gemeenschappelijke Thesaurus
Audiovisuele Archieven (GTAA)
~60.000 terms:
Subjects, Persons, 27.000 Names, Locations, Genr...
Alignment service
http://CultuurLINK.beeldengeluid.nl
ALIGNMENT
Application: Browsing Dutch-Flemish connections
Application: Browsing Dutch-Flemish connections
http://link.spinque.com/VIAA-1.0/
Connecting within Clariah
WP3: Texts
WP4: Census data
WP5: Media
Digital Humanities application: DIVE
FOUR DATA SOURCES
DIVE:MEDIA OBJECT SEM:EVENT
SEM:PLACE
SEM:TIME
SEM:ACTOR
SKOS:CONCEPT
OA:ANNOTATION
Exploratory interface
DIVE.BEELDENGELUID.NL
Wrap up
1. Publishing information as Linked Data
allows machines to re-use your data for
all kinds of search, browsing, an...
https://www.flickr.com/photos/miwok/24266853554
More than one road
Publish and link metadata (EAD?)
Metadata + content (DSS)
Start with a small part (Openbeelden)
Publish...
v.de.boer@vu.nl
http://victordeboer.com
Thank you
EAD to (EDM) RDF
BIG ? LINKED? OPEN?
Wrap up
• Graphs, not tables
• Web standards and technologies
• URIs for things
• RDF (triples) to describe these things a...
BIG DATA
LINKED DATA
OPEN DATA
Four V’s of Big Data
http://www.ey.com/GL/en/Services/Advisory/EY-big-data-big-opportunities-big-challenges
BIG DATA
LINKED DATA
VARIANCE as one of the V’s of
Big Data
VOLUME, VELOCITY are
challenges for LD
LINKED DATA
OPEN DATA
LINKED OPEN DATA
as the way to publish and reuse
datasets across the Web
www.w3.org/designissues/linkeddata.html
Presentatie for "Studiemiddag Linked Data Archieven"
Presentatie for "Studiemiddag Linked Data Archieven"
Presentatie for "Studiemiddag Linked Data Archieven"
Presentatie for "Studiemiddag Linked Data Archieven"
Upcoming SlideShare
Loading in …5
×

Presentatie for "Studiemiddag Linked Data Archieven"

548 views

Published on

Linked Data for Archives

Published in: Education
  • Be the first to comment

Presentatie for "Studiemiddag Linked Data Archieven"

  1. 1. Victor de Boer Linked Open Data (for Archives) Studiemiddag Linked Data Archieven
  2. 2. CC-by-nc-nd https://www.flickr.com/photos/joinash/
  3. 3. The ‘problem’ (archival) data is not integrated • Data is “lost” or published without reusability – In different physical locations – In different file formats – In different semantic structures • We do not want to force one monolithic data model • Flexible integration • Re-use existing data sources
  4. 4. Linked Data • Machine readable format • Standardized • Flexibility to connect heterogenous data – Link what can be linked • re-use and re-usability OBJECT EVENT PLACE TIME PERSON CONCEPT PROVENANCE
  5. 5. What is Linked Open Data?
  6. 6. Open Data is about licenses to allow reuse Linked Data is about technology for interoperability
  7. 7. ★ Available on the web (whatever format), but with an open license ★★ Available as machine-readable structured data (e.g. excel instead of image scan of a table) ★★★ as (2) plus non-proprietary format (e.g. CSV instead of excel) ★★★★ All the above plus, Use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff ★★★★★ All the above, plus: Link your data to other people’s data to provide context www.w3.org/designissues/linkeddata.html Linked Data five star system www.w3.org/designissues/linkeddata.html
  8. 8. URIs Use Web URIs for things you want to talk about. http://rijksmuseum.nl/data/painting1 My application can go there using HTTP (the Web) and I get information about it HTML page for humans RDF data for machines
  9. 9. RDF Triples form Graphs rijks:Painting001 geo:Haarlem rijks:Frans_Hals 147590 52.38084, 4.63683 geo:Noord-Holland geo:Netherlands rijks:Painting002
  10. 10. http://lod-cloud.net/
  11. 11. Some Examples
  12. 12. Dutch Ships and Sailors KB NEWSPAPERS Dutch-Asiatic Shipping“VOC Opvarenden” Jur Leinenga Matthias van Rossum Elbing voyagesArchangel voyages
  13. 13. HETEROGENEOUS but LINKED DATAMODELS dss:Record gzmvoc:Telling gzmvoc:telling-1046-De_Berkel __bnode_1 gzmvoc:aziatischeBemanning dss:Ship gzmvoc:Schip gzmvoc: schip-1046-De_Berkel dss:has_ship gzmvoc:schip "1046" “Schip” “De Berkel” rdfs:label dss:scheepsnaam gzmvoc:scheepsnaam dss:ShipType gzmvoc:Scheepstype gzmvoc: type-Ship dss:has_shiptype gzmvoc:has_shiptype gzmvoc:scheepstype “21” “Moorse mattroosen” dss:azRegistratieKop gzmvoc:azAantalMatrozen gzmvoc:telling gzmvoc:heeft DAS heenreis dss:Record das:Voyage das:voyage-1918_61 Integrate datasets No monolithic datamodel needed No normalisation / dumbing down of data needed Retain original model and intent
  14. 14. Metadata and original scans hasOriginalScan
  15. 15. Reuse: Links to other web resources Historical Newspapers http://delpher.nl isReferencedBy [HARLINGEN, 24 October.] …gestrand. Tevens is het berigt ontvan°e > dat het hier behoorende schoonerschip Transit, kapitein Schaap, in de Noordzee is gezonken, nadat het achterschip was weggeslagen ; een ligtmatroos verloor daarbij het leven. Mede zijn hier drie vreemde schepen met meer en minder zware averij binnengeloopen.
  16. 16. mdb:Schip1 mdb:Kof mdb:scheepsType das:ShipX das:Kofship das:typeOfShip Aat:Kof Aat:Platbodems skos: broader Reuse: background knowledge AAT = Art & Architecture Thesaurus http://www.getty.edu/research/tools/vocabularies/aat/ OWL = Web ontology language https://www.w3.org/OWL/
  17. 17. Data analysis and visualisation The interlinked knowledge graph makes it possible to efficiently investigate integrated research questions.
  18. 18. Mediapark Hilversum Ontstaan in 1997: 3 archieven (RVD, AVAC/Omroepen, Stichting Film en Wetenschap) 1 museum (Omroepmuseum) Grootste audiovisueel archief 70% audio-visual heritage material > 800.000 hrs
  19. 19. NISV’s task 1.Safeguard the wealth of Dutch AV material Archive, knowledge centre 2. Access to this wealth Archival collection for program makers Media Experience for everyone
  20. 20. “The Archive as a Laboratory”
  21. 21. Openbeelden.nl / openimages.eu Open source (MMBase, FFmpeg, LAMP) Open media formats (Ogg Theora, WebM) Open standards (Dublin Core, CC-REL, HTML 5) Open API (OAI-PMH, CC-0) Open content (CC-licenses, PD Mark) CC BY-SA preferable 3000 items “Internet Quality”
  22. 22. ∼800,000h ∼110h
  23. 23. Reuse: 3rd party applications http://www.beeldvoorbeeld.nl/tv/
  24. 24. ...TO CONTEXT: MUTUALLY CONNECTED COLLECTIONS... 20-9-2016 Connecting collections: topics, people, genres, etc Catalogue Photos B&GWiki Programme guides Connected (internal)
  25. 25. External: Networked heritage Links through vocabularies (such as GTAA)
  26. 26. Concept: Jan Sluijters (schilder) DBpedia Related items Links • Styles (Expressionism, Cubism, Fauvism) • Period (contemporaries) Application: LinkedTV
  27. 27. Gemeenschappelijke Thesaurus Audiovisuele Archieven (GTAA) ~60.000 terms: Subjects, Persons, 27.000 Names, Locations, Genres Published as SKOS Linked Open Data http://gtaa.beeldengeluid.nl/ http://data.beeldengeluid.nl/gtaa/30173 (Gymnastiekprogramma) http://data.beeldengeluid.nl/gtaa/35878 (Haarlem)
  28. 28. Alignment service http://CultuurLINK.beeldengeluid.nl
  29. 29. ALIGNMENT Application: Browsing Dutch-Flemish connections
  30. 30. Application: Browsing Dutch-Flemish connections http://link.spinque.com/VIAA-1.0/
  31. 31. Connecting within Clariah WP3: Texts WP4: Census data WP5: Media
  32. 32. Digital Humanities application: DIVE
  33. 33. FOUR DATA SOURCES DIVE:MEDIA OBJECT SEM:EVENT SEM:PLACE SEM:TIME SEM:ACTOR SKOS:CONCEPT OA:ANNOTATION
  34. 34. Exploratory interface DIVE.BEELDENGELUID.NL
  35. 35. Wrap up 1. Publishing information as Linked Data allows machines to re-use your data for all kinds of search, browsing, analysis, applications. 2. Flexible integration of heterogeneous data, metadata and background knowledge 3. (Re)use of web resources, vocabularies and ontologies enriches the original data
  36. 36. https://www.flickr.com/photos/miwok/24266853554
  37. 37. More than one road Publish and link metadata (EAD?) Metadata + content (DSS) Start with a small part (Openbeelden) Publish and link vocabulary (Cultuurlink?)
  38. 38. v.de.boer@vu.nl http://victordeboer.com Thank you
  39. 39. EAD to (EDM) RDF
  40. 40. BIG ? LINKED? OPEN?
  41. 41. Wrap up • Graphs, not tables • Web standards and technologies • URIs for things • RDF (triples) to describe these things and have links • Distributed, heterogeneous (meta)data • Integrate datasets in a flexible way • Cross-collection, -institution, -domain • Re-use background knowledge • Provenance fits DH requirements well • Knowledge graphs enable efficient investigation of integrated research questions.
  42. 42. BIG DATA LINKED DATA OPEN DATA
  43. 43. Four V’s of Big Data http://www.ey.com/GL/en/Services/Advisory/EY-big-data-big-opportunities-big-challenges
  44. 44. BIG DATA LINKED DATA VARIANCE as one of the V’s of Big Data VOLUME, VELOCITY are challenges for LD
  45. 45. LINKED DATA OPEN DATA LINKED OPEN DATA as the way to publish and reuse datasets across the Web
  46. 46. www.w3.org/designissues/linkeddata.html

×