Challenges on modeling annotations
in the Europeana Sounds project
Hugo Manguinhas, Sergiu Gordea, Antoine Isaac, Alessio Piccioli,
Giulio Andreini, Francesca Di Donato, Remy Gardien, Maarten
Brinkerink | iAnnotate 2016
What is Europeana?
CC BY-SA
We aggregate metadata:
• From all EU countries
• 3,500 galleries, libraries,
archives and museums
• More than 52M objects
• In about 50 languages
Europeana aggregation infrastructure
Europeana | CC BY-SA
The Platform for Europe’s Digital Cultural Heritage
Why are annotations useful?
For Users, a means to…
• Contribute their knowledge
• Discuss and share their knowledge with others
For Cultural Institutions, a new way and opportunity to
increase the quality of their metadata
• Improve consistency
• Contribute to a better semantic description, with internal
cross-linking and links to the web of data
The Europeana Sounds project
Europeana Sounds aims to increase the amount of audio
content available via Europeana
• also improving geographical and thematic coverage
Apart from aggregation, it improves discovery and use of
audio content, by enriching metadata through innovative
methods
Annotation Scenarios in Europeana Sounds
A user annotates a Cultural Heritage Object, in particular…
• Information describing the object (i.e. metadata)
• Contextual information (i.e. metadata about Agents, Places, Subjects, … )
• Media resources representing the object
By the following actions:
• Tag with terms from controlled vocabularies
• Complete or correct information
• Favour or moderate annotations made by other users
• Comment and discuss with other users
• Relate objects together
Crowdsourcing Infrastructure
[Diagram: annotation providers and the annotation client; platforms shown: TheSession.org + TunePal, HistoryPin.org, Pundit, WITH]
Exchanging annotations across platforms
We adopted the W3C Web Annotation Data Model
• Offers a simple model for exchanging annotations across platforms
... but flexible enough to support complex scenarios
We are developing a REST API based on the W3C Web Annotation
Protocol
• Which developers & Europeana will use for retrieval, creation and
search of annotations
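As a rough illustration of what creating an annotation through such an API could look like, the sketch below prepares (without sending) an HTTP request in the style of the W3C Web Annotation Protocol, which creates annotations by POSTing JSON-LD to an annotation container. The container URL, the helper name and the comment text are hypothetical; the slides do not give the actual Europeana endpoint.

```python
import json
import urllib.request

# Media type the W3C Web Annotation Protocol uses for annotation JSON-LD.
ANNO_MEDIA_TYPE = 'application/ld+json; profile="http://www.w3.org/ns/anno.jsonld"'


def build_create_request(container_url, annotation):
    """Prepare (but do not send) a POST request that would create an annotation."""
    payload = json.dumps(annotation).encode("utf-8")
    return urllib.request.Request(
        container_url,
        data=payload,
        method="POST",
        headers={"Content-Type": ANNO_MEDIA_TYPE, "Accept": ANNO_MEDIA_TYPE},
    )


# Hypothetical container URL and comment text, for illustration only.
request = build_create_request(
    "https://example.org/annotations/",
    {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": "An illustrative comment."},
        "target": "http://data.europeana.eu/item/09102/_UEDIN_214",
    },
)
```

Nothing is sent over the network here; passing `request` to `urllib.request.urlopen` would perform the call against a real container.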
Tagging with semantic resources
User scenario:
An end-user wishes to tag a
Europeana object using a
term/resource from a
controlled vocabulary
Item as displayed in the
Europeana Collections Portal
Tagging with semantic resources
The Pundit use case
[Diagram: a semantic tag expressed as a Web Annotation; labels on the diagram: DBpedia API, Available Vocabularies / Datasets]
oa:Annotation <http://data.europeana.eu/annotation/...>
  oa:motivatedBy oa:tagging
  oa:hasBody skos:Concept <http://dbpedia.org/resource/Brass_instrument>
  oa:hasTarget edm:ProvidedCHO <http://data.europeana.eu/item/09102/_UEDIN_214>
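The tag structure in the diagram can be written down as a Web Annotation in JSON-LD. The following is a minimal sketch, assuming the body is given simply as the URI of the vocabulary term; the function name is hypothetical and the annotation `id` is left out on the assumption that the server assigns it.

```python
import json


def semantic_tag(target_uri, concept_uri):
    """Build a tagging annotation whose body is a controlled-vocabulary term."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "motivation": "tagging",   # oa:motivatedBy oa:tagging
        "body": concept_uri,       # oa:hasBody -> the skos:Concept
        "target": target_uri,      # oa:hasTarget -> the edm:ProvidedCHO
    }


annotation = semantic_tag(
    "http://data.europeana.eu/item/09102/_UEDIN_214",
    "http://dbpedia.org/resource/Brass_instrument",
)
print(json.dumps(annotation, indent=2))
```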
Tagging with semantic resources
Open Questions
… some aspects may significantly influence user experience:
• Should different kinds of semantic resources be displayed in the
same way, or should they be differentiated by type (e.g. a Place vs an Agent)
or scope (e.g. Rock as a sound genre vs a physical thing)?
• Which label(s) should be displayed? Should the one that best fits the
display settings (i.e. language preferences) be used, and what if no label
exists for that language?
• Should the user annotate with any term from a vocabulary, or only
a subset?
• What to do with annotations when a vocabulary is updated by its
maintainer?
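One possible answer to the label question is a fallback chain. The sketch below is only one policy among many, not a decision taken by the project: it tries the user's preferred languages in order, then a default language, then any available label. The function name, the default language and the sample labels are illustrative assumptions.

```python
def pick_label(labels, preferred_languages, fallback_language="en"):
    """Choose a display label from a {language tag: label} mapping.

    Order: the user's preferred languages, then the fallback language,
    then any label at all (lowest language tag, to stay deterministic).
    Returns None when the resource has no labels.
    """
    for lang in list(preferred_languages) + [fallback_language]:
        if lang in labels:
            return labels[lang]
    if labels:
        return labels[min(labels)]
    return None


# Illustrative labels, not fetched from any vocabulary.
labels = {"en": "Brass instrument", "de": "Blechblasinstrument"}
print(pick_label(labels, ["fr", "de"]))  # no French label, so German is used
print(pick_label(labels, ["fr"]))        # falls back to English
```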
Tagging with semantic resources
Challenges
Client applications must:
• have all the data needed to feed the display, and
• receive it in a form they can process uniformly
This means that resources must be:
• Dereferenced
• Translated into a uniform data format
To tackle these challenges we chose the Europeana Data Model:
• Already in use at Europeana
• Reuses existing standards (e.g. SKOS, Dublin Core, WGS84 Geo Positioning)
• Gives support for all contextual resources
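To make the "translate into a uniform data format" step concrete, here is a minimal sketch that turns the labels of an already dereferenced concept into one record shape. Both the input shape (a list of value/language pairs) and the output shape (labels grouped per language) are assumptions made for illustration; the slides do not specify the exact serialization.

```python
from collections import defaultdict


def to_uniform_concept(uri, raw_labels):
    """Group the labels of a dereferenced concept by language tag.

    raw_labels: list of {"value": str, "lang": str or None} dicts
    (a hypothetical shape for whatever the vocabulary returned).
    """
    pref_label = defaultdict(list)
    for entry in raw_labels:
        # "und" is the standard language tag for an undetermined language.
        lang = entry.get("lang") or "und"
        pref_label[lang].append(entry["value"])
    return {"id": uri, "type": "skos:Concept", "prefLabel": dict(pref_label)}


concept = to_uniform_concept(
    "http://dbpedia.org/resource/Brass_instrument",
    [
        {"value": "Brass instrument", "lang": "en"},
        {"value": "Blechblasinstrument", "lang": "de"},
        {"value": "Brass", "lang": None},
    ],
)
```

With every resource reduced to the same shape, a client needs only one code path to render tags from different vocabularies.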
Tagging with geographical information
The HistoryPin.org use case
The user can specify rough
coordinates and a radius
A Europeana object displayed in
HistoryPin.org
Tagging with geographical information
The HistoryPin.org use case
{
  "@context": "http://www.w3.org/ns/anno.jsonld",
  "id": "http://data.europeana.eu/annotations/historypin/136290",
  ...
  [provenance info here],
  ...
  "motivation": "tagging",
  "body": {
    "@context": <Context for EDM>,
    "type": "edm:Place",
    "wgs84_pos:lat": "48.85341",
    "wgs84_pos:long": "2.3488"
  },
  "target": "http://data.europeana.eu/item/09102/_UEDIN_214"
}
Similar to semantic tagging, but
using a “virtual” resource
Coordinates expressed using
WGS84 Geo Positioning
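A client could assemble such a geographical tag as sketched below. This is a minimal sketch under stated assumptions: the function name is hypothetical, the EDM context is passed in by the caller because the slide leaves it unspecified, the provenance fields elided on the slide are omitted, and the range check on the coordinates is an addition not shown on the slide.

```python
def geo_tag(target_uri, latitude, longitude, edm_context):
    """Build a tagging annotation whose body is an embedded edm:Place."""
    if not -90.0 <= latitude <= 90.0:
        raise ValueError("latitude must be within [-90, 90]")
    if not -180.0 <= longitude <= 180.0:
        raise ValueError("longitude must be within [-180, 180]")
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "motivation": "tagging",
        "body": {
            "@context": edm_context,  # supplied by the caller; not given on the slide
            "type": "edm:Place",
            "wgs84_pos:lat": str(latitude),
            "wgs84_pos:long": str(longitude),
        },
        "target": target_uri,
    }


annotation = geo_tag(
    "http://data.europeana.eu/item/09102/_UEDIN_214",
    48.85341,
    2.3488,
    edm_context="<Context for EDM>",  # placeholder kept from the slide
)
```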
Annotating metadata
The Pundit use case
We consider metadata annotations as…
• any annotation that refers to, or asserts a statement about, the
information describing an object, in order to complete or correct it
Ideally, and like other annotations, they should be
• agnostic to the way they are presented to the user in the interface
• machine readable
So that metadata annotations can
• survive changes to the interface design;
• be easily shared outside the interface in which they were
originally created;
• be put to further use by other software applications
Annotating metadata
A Proposal
[Diagram: proposed model for a metadata annotation]
Nodes: oa:Annotation (http://data.europeana.eu/annotation/...); oa:SpecificResource (#metadata1); pundit:MetadataSelector (#statement1); edm:ProvidedCHO (http://data.europeana.eu/item/09102/_UEDIN_214); Graph; Correct URI
Properties: oa:motivatedBy (oa:describing?), oa:hasTarget, oa:hasSource, oa:hasSelector, rdf:predicate (dcterms:isPartOf), rdf:value, oa:hasBody
Notes on the diagram:
• A specific motivation may be needed
• Similar to an rdf:Statement but following WA guidelines
Favouring and moderating annotations
As manual per-annotation moderation does not scale well,
we wish to encourage a crowd-moderation policy among the
end-users:
• Three-strikes-out: if three users report an annotation as
in violation of the terms of use, it will be hidden.
How to differentiate moderation (violations of terms of use) from
up- and down-voting ('this is a very good annotation, +1')?
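The three-strikes-out rule can be sketched as follows. This is a minimal in-memory illustration, assuming that repeated reports from the same user count only once; the class and method names are hypothetical and say nothing about how the project stores reports.

```python
REPORT_THRESHOLD = 3  # "three strikes"


class ModerationLog:
    """Track which users reported which annotation."""

    def __init__(self):
        self._reporters = {}  # annotation id -> set of reporting user ids

    def report(self, annotation_id, user_id):
        # A set ensures that one user reporting twice still counts once.
        self._reporters.setdefault(annotation_id, set()).add(user_id)

    def is_hidden(self, annotation_id):
        reporters = self._reporters.get(annotation_id, set())
        return len(reporters) >= REPORT_THRESHOLD


log = ModerationLog()
for user in ["user-a", "user-a", "user-b"]:
    log.report("annotation-1", user)
hidden_after_two_users = log.is_hidden("annotation-1")

log.report("annotation-1", "user-c")
hidden_after_three_users = log.is_hidden("annotation-1")
```

Keeping reports in a structure separate from votes is one way to keep moderation apart from up- and down-voting, though the question raised on the slide remains open.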
Conclusion
• Requirements are becoming clearer as we work on more
concrete use-cases and validate them with real users
• Expressing cross-platform annotations in a uniform way is a
big challenge:
• The W3C Web Annotation Data Model gives a good interoperable
base
• But not all scenarios are covered yet
• Need for best practices for specific applications / domains
• Still a lot of work ahead...
...but we are making progress
http://www.europeanasounds.eu/
