Linked Open Europeana: Semantics for the Citizen
  • Europeana's current data model is the “Europeana Semantic Elements” (ESE). ESE addresses interoperability between data from the different domains represented in Europeana by reducing the data to a “flat”, Dublin-Core-like representation. This is a “simple and robust” approach, but it has drawbacks: the original metadata and information perspective are no longer visible, and we cannot specialize to finer-grained models or connect to external resources such as those of the LOD community. The EDM addresses exactly these shortcomings. It tries to transcend the different information perspectives represented in Europeana, acting as a top-level ontology that makes objects from different domains interoperable while still preserving the original data. The EDM is destined to replace ESE after the 2011 release of Europeana. ESE will then be an “application profile” of EDM, which means that all ESE data in Europeana will remain compatible with the new system.
  • EDM re-uses three ontologies, all of which are defined as RDFS models.
    • SKOS: an ontology to model KOSs (vocabularies) in the Semantic Data Layer of Europeana. It specifically enables cross-vocabulary matching between concepts.
    • Dublin Core: used to describe the core features of cultural objects. ESE uses the “old” Dublin Core Element Set. EDM uses the “new” Dublin Core Metadata Terms, which are specializations of the 15 “old” Dublin Core Elements. The use of DC Terms ensures backward compatibility with ESE.
    • OAI ORE: the typical record about an object provided to Europeana will include several information pieces, e.g. descriptive metadata, views (thumbnails, video files, audio files, text documents etc.), links to landing pages etc. OAI ORE allows us to group and organize these information pieces: the abstract “provided object” (Object), the descriptive metadata (Proxy), and any “view” of the provided object (Digital Representation).
  • Mona Lisa as described and depicted by the French Ministry of Culture (Direction des musées de France)
  • This is the metadata record of the French Ministry of Culture modeled in EDM. Each bubble represents a resource. Inside the bubble, the class of the resource (its type) appears in italics, with the URI that identifies the resource beneath it. The arrows are the semantic links (the properties) between the resources. Where two properties are shown, the lower one is a sub-property of the upper one, with a more specific meaning. First we have the Aggregation node, which groups together all information pieces delivered by the Ministry. It aggregates the node representing the physical object “Mona Lisa”, the digital representations of the Mona Lisa, and the proxy node, which is specific to a given provider and represents the description of the provided object as seen from the perspective of that provider. This is what every metadata record provided to Europeana will look like in its basic form. Why manage central nodes for provided objects? The ORE model says so: an ORE proxy has to be a proxy for some "view-independent" resource. Users are looking for (real-world) objects (the painting Mona Lisa) and not for the specific view of it held by the Louvre or Joconde (which they normally do not know about anyway). So the approach is: find the object first (PhysicalThing) and then proceed to the specific views on it. This is also the LOD approach.
  • The EDM wants to preserve a provider's original information perspective on its data as much as possible. The ability to create sub-classes and sub-properties with RDFS is crucial here. For this purpose EDM provides a range of generic properties and classes as anchors to which providers can connect more specialized properties. This is called mapping. Example: the EDM property “ens:hasMet” relates an object to the various things (persons, places, etc.) which somehow participated in its history. Here the provider mapped its more specific property “formerOwner” to “hasMet”, thereby specifying the actual relation of François I to the Mona Lisa painting. This co-existence of the generic and the specific level makes it possible, for example, to search for the painting using a generic description-based index and to display the information for that painting using the finer-grained distinctions made by the provider. There might be relatively wide semantic gaps between an EDM property and a provided sub-property (e.g. ens:hasMet and ex1:schema/formerOwner). Europeana expects communities to agree on application profiles in order to minimize such gaps and to implement functions that build on and exploit such contributions.
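The mapping idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: statements are simple (subject, property, object) tuples, `ex1:formerOwner` is shorthand for the provider property named in the note, and the `ex:` identifiers and the `generic_view` helper are hypothetical, not part of EDM.

```python
# Provider-specific property -> generic EDM property it was mapped to
# (conceptually an rdfs:subPropertyOf statement).
SUB_PROPERTY_OF = {"ex1:formerOwner": "ens:hasMet"}

# Hypothetical provider data using the fine-grained property.
provider_statements = [
    ("ex:MonaLisa", "ex1:formerOwner", "ex:FrancoisI"),
]

def generic_view(statements, sub_property_of):
    """Return the original statements plus one generic statement for
    every super-property reachable through the mapping."""
    result = list(statements)
    for subject, prop, obj in statements:
        parent = sub_property_of.get(prop)
        while parent is not None:
            result.append((subject, parent, obj))
            parent = sub_property_of.get(parent)
    return result

all_statements = generic_view(provider_statements, SUB_PROPERTY_OF)
```

A generic index can then search on `ens:hasMet`, while a display layer still has the provider's `ex1:formerOwner` statement available.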
  • Europeana wants to contextualize and enrich its objects by linking them to resources which contain additional knowledge. This enables richer functions, such as query expansion (e.g., using alternatives for a creator's name), recommendation of objects using semantic relations between them (objects created by connected artists), etc. This is the same Proxy as on the previous slide, but now all string values have been converted to typed resources. For example, the subject of the painting Mona Lisa, “femme”, is now a resource typed as a concept, with the English and French labels of the concept attached, taken from a KOS in the Semantic Data Layer. In the same KOS we could also find the broader term for this concept. Furthermore, we could semantically align the concept femme with the corresponding concept in Wikipedia (LOD cloud) and use all the information available there for this subject, including the many translations of the term itself, to increase the data value of Europeana's objects.
  • What we looked at so far can be understood as object-centric modeling. The second general modeling approach is event-centric which tries to tell a story about the object’s history. For this purpose EDM provides a simple “event-centric core” of one class and three properties: ens:Event: hub for event descriptions ens:wasPresentAt, holding between any resource and an event it is involved in; ens:happenedAt, holding between an event and a place; ens:occurredAt, holding between events and the time spans during which they occurred. This is to give you an impression of what is possible without going into details.
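A minimal sketch of the event-centric core, again with plain tuples. The class and property names (`ens:Event`, `ens:wasPresentAt`, `ens:happenedAt`, `ens:occurredAt`) come from the note above; the event itself, all `ex:` identifiers and the `co_participants` helper are made up for illustration.

```python
# One fictional event acting as the hub that connects participants,
# a place and a time span.
triples = [
    ("ex:event/e1", "rdf:type", "ens:Event"),
    ("ex:MonaLisa", "ens:wasPresentAt", "ex:event/e1"),
    ("ex:FrancoisI", "ens:wasPresentAt", "ex:event/e1"),
    ("ex:event/e1", "ens:happenedAt", "ex:place/p1"),
    ("ex:event/e1", "ens:occurredAt", "ex:timespan/t1"),
]

def co_participants(store, resource):
    """Resources that were present at any event the given resource was present at."""
    events = {o for s, p, o in store if s == resource and p == "ens:wasPresentAt"}
    return sorted(
        {s for s, p, o in store
         if p == "ens:wasPresentAt" and o in events and s != resource}
    )

others = co_participants(triples, "ex:MonaLisa")
```

The point of the sketch is that the object and the person are not linked directly: the relation between them is read off the shared event.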
  • This is a (more or less fictional) example of three records about a translation of Edgar Allan Poe’s “The Narrative of Arthur Gordon Pym of Nantucket” into French: a record from BNF about an edition from 1868; a record from Gallica about an edition from 1868 (which offers a digital version of the book online: this is the WebResource); and a record from BNF about an edition from 2007. A few things I want to point out: two records are about the same thing, and both point to the same object of interest, the 1868 edition. The user will look for this edition and not for the specific view of Gallica or BNF on this edition. So this node is the point of entry from which a user will proceed to a specific view on the object. It is also apparent now why Proxies for the descriptive metadata are helpful: this way we can keep the two views on the 1868 edition distinct. Finally, the link “isDerivativeOf” is an example of an inter-object link. So, for example, a user who found the 2007 edition will also be pointed to the digital version of the 1868 edition in Gallica. With respect to FRBR, one could now start discussing what and where the work, expression, manifestation, and item are here. Although the development of the EDM has been inspired by FRBR, FRBR is not implemented yet. That will happen after 2011.
  • EDM is still under development, and will continue to be refined until the end of 2010. It will be implemented during 2011, in the lead-up to the Danube release of Europeana. Before, during and after the implementation of EDM, data that is compliant only with ESE will continue to be accepted. EDM is compatible with ESE and no data will need to be resubmitted. Europeana will make a converter available, and any provider who wishes to resubmit data in order to increase its richness within Europeana will be able to do so, but will be under no obligation. How will EDM data be delivered to Europeana? Providers will have to create a mapping to EDM and deliver it alongside their data, which ideally are metadata records properly linked (via IDs) to a vocabulary. The data has to be in XML or RDF. From this, Europeana will create EDM data which includes enrichments and links to external resources (vocabularies in the Semantic Data Layer and/or the LOD cloud). Prototyping? At the end of the year we will start to produce the first EDM data for the production version of Europeana. This data will be taken from existing ESE data and from rich data delivered to Europeana by then.
  • First a few words about the envisioned information architecture of Europeana: This is how the information space of Europeana will be restructured : At the “bottom” we have the objects which are provided to Europeana. Above we have the “Semantic Data Layer” which is new. It contains various kinds of KOSs with knowledge about people, places, concepts, and so on. These concepts are linked to the objects below and thereby contextualize and enrich them.
  • The data provided to Europeana will come from many different kinds of domains like libraries, archives, or museums. They all will provide their specific collections and KOSs. That will naturally result in “isles of information”. In order to make the data interoperable, the concepts of the various KOSs in the Semantic Data Layer will be aligned, that means they will be connected via cross-vocabulary links. This technically enables applications to navigate through a semantic layer of concepts from different sources and to use it to access objects which are originally described by different but semantically related concepts.
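How cross-vocabulary links bridge such isles can be sketched as follows. The sketch assumes the links behave like symmetric, transitive equivalence links (in the spirit of `skos:exactMatch`); the vocabularies, concept identifiers, objects and helper functions are all hypothetical.

```python
# Alignment links between concepts of two different (made-up) vocabularies.
alignments = [("lib:woman", "mus:femme")]

# Which concept each object was originally described with.
subject_of = {
    "lib:book42": "lib:woman",
    "mus:painting7": "mus:femme",
    "arc:file3": "arc:harbour",
}

def equivalent_concepts(concept, links):
    """All concepts reachable from `concept` by following links in either direction."""
    seen = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for a, b in links:
            for source, target in ((a, b), (b, a)):
                if source == current and target not in seen:
                    seen.add(target)
                    frontier.append(target)
    return seen

def find_objects(concept, subjects, links):
    """Objects described by the concept or by any concept aligned with it."""
    wanted = equivalent_concepts(concept, links)
    return sorted(obj for obj, c in subjects.items() if c in wanted)

hits = find_objects("lib:woman", subject_of, alignments)
```

A query phrased in the library vocabulary thus also reaches the museum object, although the two were described with different concepts.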
  • Europeana intends to connect to the Linked Open Data community. In the Linked Open Data cloud we find many more knowledge sources like DBpedia, Geonames, or Library of Congress Subject Headings. Europeana wants to use them to further contextualize and enrich the objects in its information space. At the same time Europeana wants to make its own data available to other communities. The EDM is crucial for realizing this vision. [LOD cloud July 2009]
  • An excursus on RTP Doc could start here, if I had more than 20 minutes.

Linked Open Europeana: Semantics for the Citizen: Presentation Transcript

  • Linked Open Europeana: Semantics for the Citizen Prof. Dr. Stefan Gradmann Humboldt-Universität zu Berlin / School of Library and Information Science
  • Overview Re: David: From 'oral' to 'written' (Plato) – from 'written' to 'digital' (David) – changing cultural techniques are threatening
    • Linked Open Data: What is it, how does it work?
    • How does it relate to the Semantic Web ?
    • Linked Open Europeana: how will it work in Europeana
    • … for the citizen: scholars, pupils and teachers, tourists, politicians, entrepreneurs
    • Conclusion : on the importance of being 'open'
    .
  • Taking Semantic Web to Helsinki ...
    • Apologies first: I am aware of speaking in a country that named Tim Berners-Lee for its first Millennium Technology Prize as early as 2004
    • Delivering this talk in such a place is courageous …
    • It may be the equivalent of
      • … selling snow to eskimos
      • … carrying coals to Newcastle
      • … bringing beer to Munich
    • And I've done it before: some of you may remember me delivering a talk on Semantic Web issues here in Helsinki back in 2004!
    • -> Thank you for the invitation, be humble, and quote Tim Berners-Lee excessively
    Tarja Halonen & TBL
  • The Web of Documents Information Management: A Proposal (TBL, 1989)
  • Resources and Links in the Document Web
    • We have HTTP URIs to identify resources and links between them – but we are missing a few things!
    • What kinds of resources are 'Louvre.html' and 'LaJoconde.jpg'?
      • A machine cannot tell.
      • Humans can: we recognize implied context!
    • How exactly do they relate to each other?
      • A machine cannot tell.
      • Humans can: again we recognize implied context!
  • Syntactically Extending the Document Web (1)
    • We add a syntax for making statements on resources: RDF
    • Or, more generally triples ...
    • … where S and P are web resources (identified using URIs) and O is either a web resource or a literal
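As a rough sketch of what such statements look like in practice, the following represents triples in plain Python without any RDF library. The `http://example.org/` namespace and the property names are hypothetical; they only echo the 'Louvre.html' and 'LaJoconde.jpg' resources mentioned on the previous slide.

```python
from typing import NamedTuple, Union

class URI(str):
    """A web resource identified by a URI."""

class Literal(str):
    """A plain literal value (allowed only in the object position)."""

class Triple(NamedTuple):
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]

EX = "http://example.org/"  # hypothetical namespace, for illustration only

triples = [
    # Object is another web resource.
    Triple(URI(EX + "LaJoconde.jpg"), URI(EX + "isShownOn"), URI(EX + "Louvre.html")),
    # Object is a literal.
    Triple(URI(EX + "LaJoconde.jpg"), URI(EX + "title"), Literal("La Joconde")),
]

def objects(store, subject, predicate):
    """All objects of statements with the given subject and predicate."""
    return [t.obj for t in store if t.subject == subject and t.predicate == predicate]

shown_on = objects(triples, URI(EX + "LaJoconde.jpg"), URI(EX + "isShownOn"))
```

With such statements a machine can answer "how do these two resources relate?" by lookup instead of relying on implied context.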
  • Syntactically Extending the Document Web (2)
    • We add a schema language (RDFS) with elements such as
      • classes,
      • hierarchies of classes and properties,
      • inheritance
      • support for basic inferencing.
    • And thus are able to establish structures in triple aggregations resulting in lightweight domain ontologies:
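The "basic inferencing" that a class hierarchy supports can be sketched as below: an instance of a class is also an instance of every superclass. `rdf:type` and `rdfs:subClassOf` are the standard RDF/RDFS terms; the `ex:` classes, the data and the `infer_types` helper are invented for illustration and cover only this one rule.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# A tiny, made-up domain ontology plus one instance statement.
triples = {
    ("ex:Painting", SUBCLASS_OF, "ex:Artwork"),
    ("ex:Artwork", SUBCLASS_OF, "ex:CulturalObject"),
    ("ex:MonaLisa", RDF_TYPE, "ex:Painting"),
}

def infer_types(store):
    """Add (x rdf:type Super) for every (x rdf:type C) and (C rdfs:subClassOf Super),
    repeating until no new statement appears."""
    inferred = set(store)
    changed = True
    while changed:
        changed = False
        snapshot = list(inferred)
        for s, p, o in snapshot:
            if p != RDF_TYPE:
                continue
            for cls, p2, parent in snapshot:
                if p2 == SUBCLASS_OF and cls == o:
                    new = (s, RDF_TYPE, parent)
                    if new not in inferred:
                        inferred.add(new)
                        changed = True
    return inferred

closure = infer_types(triples)
```

After inference, a query for all cultural objects also returns the painting, although no one stated that explicitly.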
  • The Web of Things … Somewhat Mistaken Taken from Ronald Carpentier's Blog at http://carpentier.wordpress.com/ 2007/08/08/1-2-3/ What's wrong with this picture?
  • … and the Way we extend the Web in scope to make it a 'Web of Things'
  • And we get … Linked Data Copyright © 2008 W3C (MIT, ERCIM, Keio) http://www.w3.org/2008/Talks/0617-lod-tbl/#(4)
  • LoD … and the Semantic Web?
    • “Semantic Web done right” (TBL, http://www.w3.org/2008/Talks/0617-lod-tbl/#(3))
    • -> What was wrong about the Semantic Web in 2007?
      • Artificial Intelligence heritage (agents, heavy logic)
      • Mostly corporate, inhouse applications
      • Little visibility on the WWW (“Where's the Web in the SW?” Frank van Harmelen, 2006)
      • Misuse of the attribute 'semantic'
    • “I called this graph the Semantic Web, but maybe it should have been Giant Global Graph!” (TBL, http://dig.csail.mit.edu/breadcrumbs/node/215)
    • => Linked Open Data extends the Web of documents in syntax and scope without falling back into the mistakes of Artificial Intelligence. Future extensions may well grow into a truly 'semantic' web … ( ≠Web 3.0)
  • The Europeana Data Model: Making Europeana Part of Linked Open Data Partially based on Martin Doerr, Stefan Gradmann, Steffen Hennicke, Antoine Isaac, Herbert Van de Sompel: The Europeana Data Model (IFLA 2010)
  • Pre-EDM This made V. Reding promise a „European Digital Library“ in 2005
    • ESE
      “Europeana Semantic Elements” (ESE)
    • Created for the 2008 version of Europeana
    • enforces interoperability by converting datasets to a Dublin-Core-like “flat” representation
    • “simple and robust” but:
      • original metadata is not visible anymore
      • no specializations to finer-grained models
      • no connections to external (open data) resources
    • Probably shouldn't have been called “semantic” :-)
    • EDM
      “Europeana Data Model” (EDM)
    • destined to replace ESE with the 2011 release of Europeana
    • ESE “application profile” of EDM (backwards compatibility)
    • preserves original data while still allowing for interoperability
    • allows for Semantic Web representation
    • EDM and other standards
    • Simple Knowledge Organization System ( SKOS )
      • Models the KOSs in the Semantic Data Layer of Europeana.
      • Allows for matching between KOSs.
    • DCMI Metadata Terms
      • Used for a core of semantically interoperable properties for descriptive metadata about an object.
      • Ensures backwards compatibility to ESE.
    • Open Archives Initiative Object Reuse & Exchange ( OAI ORE )
      • Organizes the metadata about an object in Europeana:
        • Provided Object: Represents the described object of interest.
        • Digital Representation: Some digital view of the object.
        • Proxy: The description of the provided object from one given perspective.
        • Aggregation: Groups all information pieces together.
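The four building blocks listed above can be pictured as a small data structure. This is a sketch using Python dataclasses, not the EDM or OAI ORE vocabulary itself: the class and field names paraphrase the slide, and all URIs and metadata values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ProvidedObject:
    """The object of interest itself, independent of any provider's view."""
    uri: str

@dataclass
class DigitalRepresentation:
    """Some digital view of the object (thumbnail, landing page, ...)."""
    uri: str

@dataclass
class Proxy:
    """One provider's description of the provided object."""
    proxy_for: ProvidedObject
    provider: str
    metadata: Dict[str, str] = field(default_factory=dict)

@dataclass
class Aggregation:
    """Groups the object, the provider's proxy and the digital representations."""
    aggregated_object: ProvidedObject
    proxy: Proxy
    representations: List[DigitalRepresentation] = field(default_factory=list)

# Hypothetical URIs and values, for illustration only.
mona_lisa = ProvidedObject("http://example.org/object/MonaLisa")
record = Aggregation(
    aggregated_object=mona_lisa,
    proxy=Proxy(mona_lisa, "French Ministry of Culture", {"dc:title": "La Joconde"}),
    representations=[
        DigitalRepresentation("http://example.org/img/MonaLisa.jpg"),
        DigitalRepresentation("http://example.org/page/MonaLisa.html"),
    ],
)
```

A second provider describing the same painting would contribute its own `Aggregation` and `Proxy` but point to the same `ProvidedObject`, which is what keeps the different views distinct yet connected.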
    • Mona Lisa: French Ministry of Culture
    • Metadata Record in EDM
      Proxy
      Aggregation
      Digital Representations
      Object of Interest
    • Different Semantic Grains
    • Keep data expressed as close as possible to original model.
    • Using mappings to more interoperable level: the EDM.
    • Semantic Enrichment
      ens:Agent : persons or organizations ens:Place : spatial entities
      ens:TimeSpan : time periods or dates skos:Concept : entities from KOS
    • Event-Centric Modeling
    Preserving and exploiting original data also means being compatible with descriptions beyond simple object level
    • Complex Objects
    • Part-whole links for complex (hierarchical) objects
    • Order among parts of objects
    • Derivation and versioning relations
    • Current State of EDM
    • Confirmed feasibility in community workshops (archives, libraries, audiovisual archives, museums).
    • The EDM is now closely articulated with the 'Danube' requirement process (Europeana release 2011).
    • We're in the course of prototyping on a larger scale.
    • EDM Specifications and Primer:
      • http://version1.europeana.eu/web/europeana-project/technicaldocuments/
    • EuropeanaLabs:
      • http://europeanalabs.eu/
    • … and LoD
  • An Aggregation ...
  • … some context
  • … more context
    • … and the Big Picture: The Semantic Data Layer
    • The Semantic Data Layer
        library
        archive
        museum
        Bridging „isles of information“ by connecting objects from different domains via cross-vocabulary links.
      • EDM and Linked Open Data
        Europeana Information Space
        Context Data
      • DBpedia
      • PND and SWD (prototype)
      • Geonames
      • LCSH
    • 'Beyond Catalogues and Records' generates new questions!
      • Where do resource aggregations 'start'? Where do they 'end'?
      • And what constitutes document boundaries??
      • And which node was connected to which one at a given time???
      A B C
    • … and new opportunities: Triple Sets and Reasoning (1)
    • Triple Sets and Reasoning (2)
    • Triple Sets and Reasoning (3) -> Potential of novel digital heuristics!
    • An Example (1)
    • An Example (2)
    • An Example (3)
    • An Example (4)
    • An Example (5)
    • An Example (6)
    • An Example (7) Adam & Eve ∞
      • The Datacloud Behind the Example
      • Semantic Exploration, Context Discovery and Knowledge Generation:
      • Semantics for the Citizen
    • What if ...
      • Europeana EDM is fully implemented and data migrated.
      • Data originating from public bodies (PSI) are available as LoD
        • As in http://data-gov.tw.rpi.edu/wiki :
      • And we combine all this with the rest of the Cloud: what use to make of all this?
    • SwickyNotes: Ontology Based Annotation as Linked Open Data
    • “ Cretans are always Liars” … annotated
    • Perseus
    • “ Cretans are always Liars” … in Perseus!
    • -> Liddell-Scott … and further!
      • -> Isidore
      • -> Europeana
      • -> National Digital Library of Finland
      • -> Wordnet, OpenCalais, Geonames …: into context!!!
    • … for the Citizen: Examples (1) Tourists and the like
      • Helsinki: Plan my museum day in Helsinki - but avoid long walks and expensive places, expensive being everything > 10 € -> Europeana LoD + http://aikataulut.ytv.fi/reittiopas/en/ (LoD version) + http://www.hel.fi/hki/helsinki/en/Services/Culture+and+libraries/Museums+and+on+exhibit (LoD)
      • Berlin: I'd like to spend a day in places connected to Alexander von Humboldt -> Europeana LoD + http://www.fahrinfo-berlin.de/Fahrinfo/bin/ (LoD) + http://dbpedia.org/page/Alexander_von_Humboldt
      • Berlin: Take me to places that were important in the context of the public uprising of 17 June 1953 (by bus, please. And mind all the streets that have been renamed after 1990!) -> Europeana + http://www.fahrinfo-berlin.de/Fahrinfo/bin/ (LoD) + http://www.17juni53.de/karte/berlin.html (LoD)
      • Italian perspective: show me everything the French have taken away from us (not just La Joconde) -> Europeana LoD (provenance info!)
    • … for the Citizen: Examples (2) Teachers and pupils
      • Find surrealist paintings from France, Spain and Italy – is there a significant difference between the artworks produced in those countries? -> Europeana Portal (country facet)
      • Document the post-war history of your hometown -> relevant local PSI resources + Europeana LoD
      • Find jazz music by bands from London under CC SA -> ( BBC Music, MusicBrainz + Europeana LoD )
      Business people
      • Is there a significant market for access to audiovisual objects of art in France? -> Europeana LoD + PSI from INA
    • … for the Citizen: Examples (3) Politicians
      • To what extent did funding from eContent+ stimulate content digitisation activities over the past 10 years in EU member states? -> Europeana Portal (provenance information)
      • Does it actually save money after all to close down the Historic Museum in Hamburg Altona? -> Europeana (Logfiles) + relevant local PSI resources (relating cultural resources to financial data)
      • What is the Finnish contribution to Europeana – and what degree of attention did this contribution generate for our country? -> Europeana Portal + Logfiles
      • … and the political bit
    • On the Importance of being 'Open' (1)
      • “ Openness (allowing access) is separate question.” (TBL, http://www.w3.org/2008/Talks/0617-lod-tbl/#(22))
      • Does Linked Data work without being 'open' ?
        • Technically speaking: yes (cf. pharma industry or biomedical data)
        • But it gets horribly expensive that way …
        • … much too expensive, probably, for Europeana to afford!
        • And much of its 'semantic' charms would be lost in such a setting, anyway.
    • On the Importance of being 'Open' (2)
      • This has a number of implications
        • No control over data usage
        • No income to be generated from data access and use
        • Innovative and (commercially) attractive services can be built on LoD
      • -> Do not repeat mistakes we are very familiar with from the Open Access debates of the past 10 years!
        • 'open' vs. 'free',
        • 'free' vs. 'commercial'
      • -> Do not exclude commercial reuse for Europeana metadata!
      • -> What is the actual value of context (in business terms!)?
    • Selected Reading
      • Martin Doerr, Stefan Gradmann, Steffen Hennicke, Antoine Isaac, Carlo Meghini, Herbert van de Sompel: The Europeana Data Model. IFLA 2010 (Gothenburg). Session on „Libraries and the Semantic Web“. http://www.ifla.org/files/hq/papers/ifla76/149-doerr-en.pdf
      • Stefan Gradmann: Knowledge = Information in Context: on the Importance of Semantic Contextualisation in Europeana. Europeana White Paper 1. http://www.scribd.com/doc/32110457/Europeana-White-Paper-1
      • A lot more mercifully skipped ...
    • Questions?