Kai Eckert - A Linked Data based Infrastructure for DM2E


  1. A Linked Data based Infrastructure for DM2E (WP2). Kai Eckert, University of Mannheim. DM2E All WP Meeting, November 30th, 2012, Vienna.
     Kai Eckert: A Linked Data based Infrastructure for DM2E (All WP Meeting, November 30th, 2012, Vienna)
  2. Agenda
     • Motivation: Follow the Linked Data Principles
     • Linked Data Architecture
     • Provenance in the Europeana Data Model
     • OAI-ORE vs. Named Graphs
     • Linked Data Publishing with Provenance
  3. Architecture. Integrated Webclient.
  4. The Europeana Data Model (EDM). Provenance is realized by means of OAI-ORE. Problems? Users have to understand Proxies. Users have to understand Aggregations. Wouldn't named graphs be nicer?
  5. Removing the proxies. Proxies are (proxy-) resources for the actual resources. Every data provider has its "own" resource to describe, as a placeholder. Practical approach: we use named graphs to distinguish descriptions from different providers within our store.
  6. Removing the Aggregations? What is an aggregation? "Aggregations are used in Europeana to represent the complex constructs that are provided by contributors. An aggregation is associated to the object that it is about, by the property edm:aggregatedCHO." Level of aggregation: 1 aggregation per providedCHO. EuropeanaAggregation aggregates other aggregations (from data providers).
  7. A Named Graph per Resource. Corresponds to the EDM Aggregations. Fine-grained... feasible? Named Graphs as first class members in the model. Statements about the aggregation that are only valid for one resource! If we allow this, the named graph must never get lost!
  8. A Named Graph per Collection. This information must not get lost either. But: it is not only valid for one resource. We are now more flexible regarding the publication of the data. But: where are the aggregations?
  10. One Named Graph per Provided Dataset. Naturally fits the provenance requirements: all statements stem from some dataset. Positive aspect: data providers do not have to care any more!
  11. Provenance and Versioning on Collection Level
  12. Overlapping Resource Descriptions
  13. Crosswalk to EDM. Are we still backwards compatible? YES :-)
  14. Publishing
  15. What's inside our store?
     • RDF Datasets per Collection, organized in Named Graphs.
     • NG URI scheme: http://data.dm2e.eu/data/collection/[provider]/[collectionId]/[version]
     • Additional provenance statements for each collection.
  16. Make it available. Web documents (with URI) deliver RDF; provenance is included as statements about the URI. On the client side, the document creates a new Named Graph, with the URI as name.
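The client-side behaviour on slide 16 can be sketched roughly as follows. This is a minimal illustration, not the DM2E implementation: it assumes the document has already been fetched and parsed into `(subject, predicate, object)` tuples, and the names `ingest`, `dataset`, and the example triples are hypothetical.

```python
def ingest(dataset, doc_uri, triples):
    """Store a fetched document as the named graph `doc_uri`.

    `dataset` is a plain dict mapping graph URI -> set of triples
    (an assumption made for this sketch). Returns the statements whose
    subject is the document URI itself, i.e. the provenance.
    """
    dataset[doc_uri] = set(triples)
    return {t for t in dataset[doc_uri] if t[0] == doc_uri}


# Hypothetical document: one data statement and one provenance statement.
doc = "http://data.dm2e.eu/data/collection/providerA/coll1/1"
dataset = {}
provenance = ingest(dataset, doc, [
    ("http://example.org/item/1", "dc:title", "Some title"),
    (doc, "dc:creator", "providerA"),
])
```

Because the graph name equals the document URI, the provenance statements stay attached to exactly the data they describe, even after the data is copied into a local store.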
  17. RESTful API (Publishing). All paths are relative to http://data.dm2e.eu/data/...
     • ... collection/[provider]/[collectionID]/[version] (dm2e:Collection) => dump of one whole ingested dataset
     • ... resource/[provider]/[collectionId]/[identifier] (edm:providedCHO) => 303 to latest version of describing Aggregation
     • ... collection/[provider]/[collectionID]/[version]/[identifier] (ore:Aggregation) => data about a single resource
     • ... linkset/[provider]/[linksetID]/[version] (dm2e:LinkSet) => generated links
     • ... linkset/[provider]/[linksetID]/[version]/[provider]/[collectionID]/[identifier] (dm2e:LinkAggregation) => links for a specific resource
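As a rough sketch of how the URI patterns on slide 17 could be told apart, the following classifies a URI by its first path segment and the number of segments after it. The slide notes that the implementation was still pending, so everything here is an assumption: the names `ROUTES`, `classify`, and `redirect_target` are hypothetical, path segments are assumed to contain no unencoded slashes, and versions are assumed to be comparable integers.

```python
BASE = "http://data.dm2e.eu/data/"

# (first path segment, number of following segments) -> class named on the slide
ROUTES = {
    ("collection", 3): "dm2e:Collection",
    ("resource", 3): "edm:providedCHO",
    ("collection", 4): "ore:Aggregation",
    ("linkset", 3): "dm2e:LinkSet",
    ("linkset", 6): "dm2e:LinkAggregation",
}


def classify(uri):
    """Return the class a URI denotes under the slide's scheme, or None."""
    if not uri.startswith(BASE):
        return None
    kind, *rest = uri[len(BASE):].strip("/").split("/")
    return ROUTES.get((kind, len(rest)))


def redirect_target(resource_uri, versions):
    """Location for the 303 response: the Aggregation in the latest version.

    `versions` is the list of known dataset versions (assumed integers).
    """
    if classify(resource_uri) != "edm:providedCHO":
        raise ValueError("not a resource URI")
    _, provider, collection, identifier = (
        resource_uri[len(BASE):].strip("/").split("/")
    )
    latest = max(versions)
    return f"{BASE}collection/{provider}/{collection}/{latest}/{identifier}"
```

For example, `redirect_target(BASE + "resource/p/c/x", [1, 2, 3])` yields the Aggregation URI under version 3 of the collection.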
  18. Provenance in Documents. Generated from provenance information about datasets:
     • dc:creator => data provider
     • dc:date => timestamp
     • dm2e:version => version number
     • dm2e:nextVersion => link to next version of the document
     • dm2e:previousVersion => link to previous version
     • dm2e:links => link to a linkset
     Optional: PROV statements for the full provenance chain. Maintained by the DM2E infrastructure. "Version" always means the version of the underlying dataset.
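The provenance statements listed on slide 18 could be generated along these lines. This is only a sketch: the function name, its parameters, and the tuple representation are hypothetical, and the `dm2e:` terms are left as prefixed names because their definitions are not given in the slides.

```python
def provenance_statements(doc_uri, provider, timestamp, version,
                          previous_version=None, next_version=None,
                          linksets=()):
    """Build the per-document provenance triples named on the slide."""
    triples = [
        (doc_uri, "dc:creator", provider),
        (doc_uri, "dc:date", timestamp),
        (doc_uri, "dm2e:version", version),
    ]
    # Version links are only emitted when a neighbouring version exists.
    if previous_version is not None:
        triples.append((doc_uri, "dm2e:previousVersion", previous_version))
    if next_version is not None:
        triples.append((doc_uri, "dm2e:nextVersion", next_version))
    for linkset_uri in linksets:
        triples.append((doc_uri, "dm2e:links", linkset_uri))
    return triples
```

Every statement has the document URI as its subject, which matches the idea that provenance is expressed as statements about the URI that names the graph.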
  19. Consuming our Data (WP1 and WP3)
     • Fetch data from our URIs.
     • Fetch suitable linksets from our URIs (links provided with data).
     • Local data cleansing (recommended for WP3): unify all URIs based on owl:sameAs links for better local querying (or use reasoning).
     • The client has to maintain a mapping for original URIs per Named Graph for the proper representation of annotations.
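The local cleansing step on slide 19 can be sketched with a union-find over the owl:sameAs links. The slides do not prescribe an algorithm, so this is one possible approach under stated assumptions: the dataset is a dict of graph URI to triple sets, the lexicographically smallest URI of each equivalence class is picked as its representative (an arbitrary choice), and the function names are hypothetical.

```python
def canonical_map(same_as_links):
    """Map every linked URI to one representative of its owl:sameAs class."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as_links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep  # smaller URI stays the representative
    return {x: find(x) for x in list(parent)}


def unify(dataset, same_as_links):
    """Rewrite all graphs to canonical URIs.

    Returns (unified dataset, per-graph mapping canonical URI -> original
    URIs), so annotations can still be shown against the original URIs.
    """
    canon = canonical_map(same_as_links)
    unified, originals = {}, {}
    for graph, triples in dataset.items():
        unified[graph] = set()
        originals[graph] = {}
        for triple in triples:
            rewritten = tuple(canon.get(term, term) for term in triple)
            unified[graph].add(rewritten)
            for old, new in zip(triple, rewritten):
                if old != new:
                    originals[graph].setdefault(new, set()).add(old)
    return unified, originals
```

Keeping the reverse mapping per named graph, rather than globally, preserves which provider used which original URI.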
  20. Annotations
     • Annotations are always on resource or statement level.
     • Subject of an annotation: [Graph URI]/URLencode(resource URI) or [Graph URI]/URLencode(subject,predicate,object)
     • Example: http://data.dm2e.eu/data/collection/[provider]/[collectionID]/[version]/[identifier]/[subject,predicate,object]
     • Similar to XPointer, SharedCanvas, ...
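The annotation subject scheme on slide 20 could look like this in code. It is a sketch under assumptions the slide does not settle: the three terms of a statement are joined with commas before encoding (terms that themselves contain commas are not handled), every reserved character is percent-encoded, and the function names are hypothetical.

```python
from urllib.parse import quote, unquote


def resource_target(graph_uri, resource_uri):
    """[Graph URI]/URLencode(resource URI)"""
    return f"{graph_uri.rstrip('/')}/{quote(resource_uri, safe='')}"


def statement_target(graph_uri, subject, predicate, obj):
    """[Graph URI]/URLencode(subject,predicate,object)"""
    joined = ",".join((subject, predicate, obj))
    return f"{graph_uri.rstrip('/')}/{quote(joined, safe='')}"


def split_target(target_uri):
    """Recover (graph URI, decoded payload) from an annotation subject."""
    # The encoded payload contains no '/', so the last '/' is the separator.
    graph_uri, _, encoded = target_uri.rpartition("/")
    return graph_uri, unquote(encoded)


graph = "http://data.dm2e.eu/data/collection/p/c/1"  # hypothetical graph URI
target = resource_target(graph, "http://example.org/item/1")
```

Encoding with `safe=''` means the payload never introduces an extra path separator, which is what makes the split in `split_target` unambiguous.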
  21. What is missing?
     • Several definitions of dm2e: terms.
     • A vocabulary for the annotations: requirements of the scientists; mark wrong statements; comment on statements; ...
     • A vocabulary for the classification of linksets: automatically created; manually created; recall-oriented (exploratory, but with wrong links); precision-oriented (incomplete, but high quality); ...
     • ...
  22. Implementation pending ;-) Questions? Suggestions?
