E-Culture semantic search pilot

Seminar, Stanford Medical Informatics, August 2006

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
204
On SlideShare
0
From Embeds
0
Number of Embeds
9
Actions
Shares
0
Downloads
2
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

E-Culture semantic search pilot

  1. MultimediaN Pilot E-Culture
  2. Pilot E-Culture
     • Partners: VU, UvA, CWI, DEN, ICN
     • Subproject of MultimediaN, a 16 MEuro project on multimedia technology funded by the Dutch government
     • Aim: demonstrate added value of Semantic Web techniques for virtual heritage collections
  3. (slide without text)
  4. Hypothesis
     • Semantic Web technology is particularly useful in knowledge-rich domains
     or, formulated differently:
     • If we cannot show added value in knowledge-rich domains, then it may have no value at all
  5. Use case: painting style
     Find paintings of a similar style
     KLIMT, Gustav
     Portrait of Adele Bloch-Bauer I
     1907
     Oil and gold on canvas
     138 x 138 cm
     Austrian Gallery, Vienna
  6. How can we find this other ‘Art nouveau’ painting?
     MUNCH, Edvard
     The Scream
     1893
     Oil, tempera and pastel on cardboard
     91 x 73.5 cm
     National Gallery, Oslo
  7. Issues w.r.t. the use case
     • Parse annotations to find matches with thesaurus terms
       – E.g. match artists to ULAN individuals
     • Artist-style links
       – AAT contains styles; ULAN contains artists, but there is no link
         • Learn link from corpora
         • Derive it from other annotations
       – Domain-specific rules/reasoning needed
         • see example in SWRL doc
         • Painters may have painted in multiple styles
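The "derive it from other annotations" option on slide 7 could look roughly like the sketch below. It is an illustration only: the sample annotations, the `min_share` cut-off and both function names are assumptions, not part of the pilot. The idea is to count how often an artist's annotated works carry a given style and to keep every style above a share threshold, so that an artist can end up with several styles.

```python
from collections import Counter, defaultdict

# Hypothetical (creator, style) annotations, as they might be extracted from
# collection metadata. The values are illustrative, not project data.
WORKS = [
    ("Klimt, Gustav", "Art Nouveau"),
    ("Klimt, Gustav", "Art Nouveau"),
    ("Klimt, Gustav", "Symbolism"),
    ("Munch, Edvard", "Expressionism"),
    ("Munch, Edvard", "Art Nouveau"),
    ("Munch, Edvard", "Expressionism"),
]


def derive_artist_styles(works, min_share=0.3):
    """Return {artist: [styles]}, keeping each style that covers at least
    min_share of that artist's annotated works (min_share is an assumption)."""
    counts = defaultdict(Counter)
    for artist, style in works:
        counts[artist][style] += 1
    links = {}
    for artist, style_counts in counts.items():
        total = sum(style_counts.values())
        links[artist] = sorted(
            style for style, n in style_counts.items() if n / total >= min_share
        )
    return links


def similar_style_artists(links, artist):
    """Artists that share at least one derived style with the given artist."""
    styles = set(links.get(artist, []))
    return sorted(
        other for other, other_styles in links.items()
        if other != artist and styles & set(other_styles)
    )


LINKS = derive_artist_styles(WORKS)
```

With the sample data, `similar_style_artists(LINKS, "Klimt, Gustav")` returns `["Munch, Edvard"]`, because both artists keep "Art Nouveau" as one of their derived styles.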
  8. Diagram: DIGITAL HERITAGE COLLECTIONS, with semantic annotation and semantic search
     • Natural-language processing: automatic annotation, text strings → concepts
     • Distributed: cultuurwijzer.nl collections, OAI-based access
     • Reasoning support: time/space reasoning
     • Web interface: support for web collections
     • Presentation facilities: semantic presentation, device-specific
     • Interoperability: XML/RDF/OWL
     • Scalability: > 10,000,000 triples
     • Ontologies: WordNet, AAT, TGN, ULAN, Dutch labels
     • Search strategies: sibling search, semantic distance
     • Dublin Core: specializations, dumb-down
     Legend: baseline, enhanced features, new features
  9. Architecture
  10. Use of thesauri
     • RDF/OWL data models of Getty thesauri
       – Issues: scope, preserving structure
     • WordNet: W3C SWBPD work, http://www.w3.org/TR/wordnet-rdf/
     • Multilingualism
       – Dutch version of AAT
     • Existing collection metadata are parsed to find matches in thesauri (e.g. creator name => ULAN entry)
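The last bullet of slide 10 (creator name => ULAN entry) could be approximated by normalised string matching, as in the sketch below. The thesaurus slice and its identifiers are made up for illustration and are not real ULAN IDs; the normalisation rules are likewise an assumption about what such a matcher might do, not the pilot's actual parser.

```python
import unicodedata

# Hypothetical slice of a name thesaurus: identifier -> known name variants.
# The identifiers are invented for this example and are not real ULAN IDs.
THESAURUS = {
    "ulan:example-1": ["Klimt, Gustav", "Gustav Klimt"],
    "ulan:example-2": ["Munch, Edvard", "Edvard Munch"],
}


def normalize(name):
    """Lower-case, strip accents and punctuation, and sort the name parts so
    that 'KLIMT, Gustav' and 'Gustav Klimt' compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    unaccented = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in unaccented.lower())
    return " ".join(sorted(cleaned.split()))


def build_index(thesaurus):
    """Map every normalised name variant to the set of entries that use it."""
    index = {}
    for entry_id, variants in thesaurus.items():
        for variant in variants:
            index.setdefault(normalize(variant), set()).add(entry_id)
    return index


def match_creator(index, creator):
    """Return the sorted candidate entry ids; an empty list means no match."""
    return sorted(index.get(normalize(creator), ()))


INDEX = build_index(THESAURUS)
```

Returning a list rather than a single identifier leaves room for ambiguous names, which would then need disambiguation from other metadata such as dates.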
  11. Distributed vs. centralized collection data
     • Minimal requirement: collection object has image URI
     • Preference for external metadata, accessed through protocol such as OAI
     • In practice, external metadata access is still cumbersome
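As a rough illustration of the OAI-based access mentioned on slide 11, the sketch below builds an OAI-PMH `ListRecords` request and pulls URIs out of a Dublin Core response. The endpoint, the sample record and the assumption that the image URI is carried in `dc:identifier` are all hypothetical; real collections may expose the image location differently.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for the given endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def image_uris(response_xml):
    """Map each record's OAI identifier to its dc:identifier values that look
    like http(s) URIs (assumed here to be where the image URI lives)."""
    root = ET.fromstring(response_xml)
    result = {}
    for record in root.iterfind(".//oai:record", NS):
        oai_id = record.findtext("oai:header/oai:identifier", namespaces=NS)
        result[oai_id] = [
            element.text
            for element in record.iterfind(".//dc:identifier", NS)
            if element.text and element.text.startswith("http")
        ]
    return result


# Hypothetical response fragment, for illustration only.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
 <ListRecords>
  <record>
   <header><identifier>oai:example.org:1</identifier></header>
   <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
     <dc:title>Example work</dc:title>
     <dc:identifier>http://example.org/images/1.jpg</dc:identifier>
    </oai_dc:dc>
   </metadata>
  </record>
 </ListRecords>
</OAI-PMH>"""
```

A record that yields an empty list here would fail the slide's minimal requirement of having an image URI.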
  12. Search strategies
     • Basic search: keyword-oriented
     • Advanced search:
       – Tweaking default search parameters
       – Time-related queries
     • Faceted search
     • Relation search
       – How are two URIs related?
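The relation search question on slide 12 ("How are two URIs related?") can be read as a path search over the triple graph. The sketch below is one possible reading, not the pilot's implementation: it runs a breadth-first search over hypothetical triples, follows edges in either direction, and returns a shortest chain of triples. The URIs, predicate names and `max_hops` limit are assumptions.

```python
from collections import deque

# Hypothetical (subject, predicate, object) triples; URIs and predicate names
# are illustrative only.
TRIPLES = [
    ("work:adele", "creator", "artist:klimt"),
    ("artist:klimt", "style", "style:art-nouveau"),
    ("work:scream", "creator", "artist:munch"),
    ("artist:munch", "style", "style:art-nouveau"),
]


def find_relation(triples, start, goal, max_hops=4):
    """Breadth-first search for a shortest chain of triples linking two
    resources, following edges in either direction. Returns the chain as a
    list of triples, or None if no chain of at most max_hops triples exists."""
    neighbours = {}
    for triple in triples:
        subject, _, obj = triple
        neighbours.setdefault(subject, []).append((obj, triple))
        neighbours.setdefault(obj, []).append((subject, triple))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt, triple in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [triple]))
    return None
```

For the sample data, `find_relation(TRIPLES, "work:adele", "work:scream")` returns four triples that connect the two works through their creators and the shared style.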
  13. Keyword search with semantic clustering
     1. B-tree of literals plus Porter stem and metaphone index
     2. Find resources with matching labels
        • Default resources are “Work”s
     3. Find related resources by one-way graph traversal
        • owl:inverseOf is used
        • Threshold used for constraining search
     4. Cluster results (group instances)
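The four steps on slide 13 can be sketched as follows. This is a simplified illustration under several assumptions: a plain dictionary stands in for the B-tree, `crude_stem` is a rough stand-in for a Porter stemmer, the metaphone index is omitted, inverse properties are assumed to be already materialised so that a one-way traversal suffices, and the `decay` and `threshold` values as well as all data and predicate names are hypothetical.

```python
from collections import defaultdict

# Hypothetical data. LABELS maps resources to literal labels, EDGES holds
# directed (subject, predicate, object) triples with inverse properties
# already materialised, and WORKS marks the target resources.
LABELS = {
    "style:art-nouveau": "Art Nouveau",
    "work:adele": "Portrait of Adele Bloch-Bauer I",
    "work:poster": "Art nouveau poster",
}
EDGES = [
    ("style:art-nouveau", "styleOf", "artist:klimt"),
    ("artist:klimt", "creatorOf", "work:adele"),
]
WORKS = {"work:adele", "work:poster"}


def crude_stem(word):
    """Rough stand-in for a Porter stemmer: lower-case the word and strip a
    trailing plural 's'."""
    word = word.lower()
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def build_index(labels):
    """Step 1: index every stemmed label token (a dict instead of a B-tree)."""
    index = defaultdict(set)
    for resource, label in labels.items():
        for token in label.replace("-", " ").split():
            index[crude_stem(token)].add(resource)
    return index


def search(keyword, index, edges, works, decay=0.5, threshold=0.2):
    """Steps 2-4: find label matches, traverse outgoing edges until the score
    drops below the threshold, and group the works found by the chain of
    predicates that led to them."""
    outgoing = defaultdict(list)
    for subject, predicate, obj in edges:
        outgoing[subject].append((predicate, obj))
    clusters = defaultdict(list)
    for hit in sorted(index.get(crude_stem(keyword), ())):
        frontier = [(hit, (), 1.0)]
        seen = {hit}
        while frontier:
            node, path, score = frontier.pop()
            if node in works:
                clusters[path or ("label",)].append((node, score))
                continue
            for predicate, nxt in outgoing[node]:
                new_score = score * decay
                if nxt not in seen and new_score >= threshold:
                    seen.add(nxt)
                    frontier.append((nxt, path + (predicate,), new_score))
    return dict(clusters)


INDEX = build_index(LABELS)
```

Searching for "nouveau" with the sample data yields two clusters: works whose own label matches, and works reached through the style and its artists. Raising `threshold` above 0.25 drops the second cluster, which shows how the threshold constrains the traversal.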
  14. Demonstrator
  15. Search: WordNet patterns that increase recall without sacrificing precision (Hollink)
  16. Triple statistics
  17. Status
     • 4-year project, now in month 18
     • Short-term goals:
       – Adding more ethnological collections
       – Location-oriented presentation
       – User studies with professional users (museum people) and interested lay persons
       – Multi-lingual interface (English, Dutch, Indonesian)
  18. Issues
     • Getting access to collections is mainly a social process
       – There is usually no principled objection to making data, metadata and thesauri publicly available, but it still feels threatening
     • Cultural heritage is a good area for a Semantic Web “island”:
       – lots of domain-specific knowledge
       – strong application pull
       – enormous amount of existing annotations, which have been built up over centuries
  19. On-line demo
     http://e-culture.multimedian.nl
