2009.09.29   chris poppe - metadata

Chris Poppe presents current and future metadata trends at a cultural heritage workshop.

  • MODS (Metadata Object Description Schema, Library of Congress): a schema for a bibliographic element set, used mainly in library applications.

Presentation Transcript

  • Metadata: Starting Points, Priorities, Future Perspectives, and Notes from the Margin. Chris Poppe, Multimedia Lab, Department of Electronics and Information Systems, Faculty of Engineering, Ghent University
  • Multimedia Lab
      • Research group of Ghent University (Faculty of Engineering)
      • Multimedia
        • Video!
          • Coding
          • Processing
          • Transmission
          • Analysis
          • Adaptation
          • Annotation
    Metadata Chris Poppe Levend Geheugen, Brussels, Belgium – September 29 2009
  • Outline
    • What is metadata?
    • Metadata vs. Tags?
      • Benefits/disadvantages?
    • What is a metadata standard?
      • Why is it needed?
      • What does it look like?
      • What are the problems?
    • What is the semantic web?
      • Web 2.0?
      • Web 3.0?
      • Semantic Web Technologies?
    • Conclusions
  • Metadata
    • Data describing data
    • Museum for the history of sciences
  • Metadata
    • Data describing data
    • Digital content
      • Example fields: resolution, DPI, date/time created, creator, camera used, file format (JPG, BMP, GIF, PNG, …), location shot (GPS), copyright, title, genre, rating, comment, keywords, depicted event, …
  • Use of Metadata
    • Understanding of multimedia content
    • Sharing
    • Management
    • Retrieval
      • Search
      • Browse
    • Processing
  • Metadata: tags
    • Tag
      • Free text annotation
      • Keywords, terms, comments
      • Informally
      • Personally
      • Started as taxonomies or vocabularies used to describe content
      • Evolved into folksonomies
        • User-driven
  • Taxonomies
    • Top down
    • Pre-defined structure
    • Hierarchy
    • Controlled vocabularies
    • Expert
  • Taxonomies
    • Example
      • Dewey Decimal Classification
      • Library classification
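
A top-down taxonomy like this can be sketched as a small tree in which every class points to its parent. The class numbers and captions below are a tiny hand-picked excerpt of the top levels of the Dewey Decimal Classification; the dictionary layout and the function name `path_to_root` are assumptions made for illustration only.

```python
# A tiny excerpt of a top-down taxonomy: each class code maps to
# (caption, parent code). Only a few Dewey classes are listed here.
TAXONOMY = {
    "500": ("Science", None),
    "510": ("Mathematics", "500"),
    "530": ("Physics", "500"),
    "700": ("Arts & recreation", None),
    "780": ("Music", "700"),
}


def path_to_root(code: str) -> list[str]:
    """Return the captions from the given class up to its top-level class."""
    path = []
    current = code
    while current is not None:
        caption, parent = TAXONOMY[current]
        path.append(caption)
        current = parent
    return path


print(path_to_root("530"))
```

Because the structure is predefined, a code that is not in the table simply cannot be used, which is the opposite of free-form tagging.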
  • Folksonomy
    • Folk + taxonomy
      • Free form text annotation
      • No predefined structure
      • No hierarchy
      • Users add metadata
      • Flat name space
      • Bottom up
    • Two types:
      • Broad
      • Narrow
  • Broad Folksonomy
    • Tagging shared content
    • Anyone can participate
    • Examples
  • Narrow Folksonomy
    • Tagging your own content
    • Tagging friends’ content
      • No consolidation
      • No emerging vocabularies
  • Tagging usage
    • Navigation
      • Tag clouds
      • Organization
      • Hints
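
A tag cloud is essentially a frequency count mapped onto font sizes. The following is a minimal sketch of that idea; the tag list, the size range, and the function name `tag_cloud` are hypothetical.

```python
from collections import Counter

# Hypothetical tags attached by users to a handful of photos.
tags = ["brussels", "museum", "brussels", "science", "museum", "brussels"]


def tag_cloud(tag_list, min_size=10, max_size=30):
    """Map each tag to a font size proportional to how often it is used."""
    counts = Counter(tag_list)
    low, high = min(counts.values()), max(counts.values())
    span = max(high - low, 1)  # avoid division by zero when all counts match
    return {
        tag: min_size + (count - low) * (max_size - min_size) // span
        for tag, count in counts.items()
    }


cloud = tag_cloud(tags)
print(cloud)
```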
  • How to tag?
    • Totally free
    • Semi-structured
    • Hinted
  • Tagging problems
    • Cultural differences: Genghis Khan is a hero for some, a criminal for others
    • Communities of users can give different meaning to tags: Movie vs. Film vs. Cinema
    • Language issues
    • Ambiguity
    • Misspelled tags (40% Flickr, 28% del.icio.us)
    • Semantics of tags
      • Factual tags: what the content is, or what it is about: ‘image’, ‘article’, ‘blog’, …
      • Subjective tags: the user’s opinion: ‘funny’, ‘hot’, ‘stupid’, …
      • Personal tags: self-reference: ‘toread’, ‘mycomments’, …
      • Tag: “nothing to do with Brussels”
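
Some of these problems (misspellings, community-specific synonyms) can be reduced by normalizing tags against a known vocabulary. The sketch below is one possible approach using Python's standard `difflib`; the vocabulary, the synonym table, and the 0.8 similarity cutoff are assumptions, not something the slides prescribe.

```python
import difflib

# Assumed controlled vocabulary; in practice this would come from a thesaurus.
VOCABULARY = ["brussels", "museum", "film", "science"]

# Assumed synonym table that folds community-specific terms into one tag.
SYNONYMS = {"movie": "film", "cinema": "film"}


def normalize_tag(raw: str) -> str:
    """Lower-case a tag, fold synonyms, and repair near-miss spellings."""
    tag = raw.strip().lower()
    tag = SYNONYMS.get(tag, tag)
    # get_close_matches returns the best candidates above the cutoff ratio.
    matches = difflib.get_close_matches(tag, VOCABULARY, n=1, cutoff=0.8)
    return matches[0] if matches else tag


print([normalize_tag(t) for t in ["Brusels", "Movie", "Cinema", "toread"]])
```

Personal tags such as ‘toread’ pass through unchanged, so this only addresses spelling and synonyms, not ambiguity or subjectivity.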
  • Metadata
    • Data describing data
    • Digital content
      • Example fields: resolution, DPI, date/time created, creator, camera used, file format (JPG, BMP, GIF, PNG, …), location shot (GPS), copyright, title, genre, rating, comment, keywords, depicted event, …
  • Multimedia
    • Compression and container formats
    • Standards for multimedia
      • Standards for metadata?
    • Formats shown: MP2, JPEG, MPEG-2, MXF, JPEG2000, AVI, AAC, H.264/MPEG-4 AVC, PNG, Motion JPEG2000, TIFF, MP4, MPEG, WAV, FLAC, VC-1, Ogg Vorbis, DivX, AIFF, GIF, JPEG-LS, Matroska, OGM/OGG, Windows Media Audio, DIRAC, 3GP, DV, FLV, Betacam, Realmedia, MOV, AC-3/Dolby Digital, Theora, ASF, TTA
    • Groups shown: Video compression, Audio compression, Image compression, Physical Containers
  • Metadata Standard
    • Standard which determines the structure of metadata
    • Example: MODS (Metadata Object Description Schema)
        <?xml version="1.0" encoding="UTF-8" ?>
        <mods xmlns="http://www.loc.gov/mods/…" …>
          <titleInfo>
            <title>De geruchten</title>
          </titleInfo>
          <name type="personal">
            <namePart>Claus, Hugo</namePart>
            <namePart type="date">1929-</namePart>
            <role>
              <text>creator</text>
            </role>
          </name>
          <typeOfResource>text</typeOfResource>
          <originInfo>… </originInfo>
          ...
        </mods>
  • XML
    • XML (Extensible Markup Language)
      • Standardized by W3C (World Wide Web Consortium)
      • Language to define the structure of a document
    • Example (a book list):
        <?xml version="1.0" encoding="UTF-8" ?>
        <!-- This is a book list. -->
        <boekenlijst>
          <boek categorie="thriller">
            <titel>Het Bernini Mysterie</titel>
            <auteur>Dan Brown</auteur>
          </boek>
          <boek categorie="woordenboek">
            <titel>Van Dale Frans-Nederlands</titel>
            <auteur />
          </boek>
        </boekenlijst>
    • XML element
    • Attribute
    • Values
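
To make the three building blocks concrete, the book list can be read with Python's standard `xml.etree.ElementTree` parser. This is a minimal sketch; the English dictionary keys (`category`, `title`, `author`) are my own naming, not part of the slides.

```python
import xml.etree.ElementTree as ET

BOOK_LIST = """<?xml version="1.0" encoding="UTF-8"?>
<!-- This is a book list. -->
<boekenlijst>
  <boek categorie="thriller">
    <titel>Het Bernini Mysterie</titel>
    <auteur>Dan Brown</auteur>
  </boek>
  <boek categorie="woordenboek">
    <titel>Van Dale Frans-Nederlands</titel>
    <auteur />
  </boek>
</boekenlijst>"""

# Encode to bytes so the parser honours the encoding declaration.
root = ET.fromstring(BOOK_LIST.encode("utf-8"))

books = []
for boek in root.findall("boek"):
    books.append({
        "category": boek.get("categorie"),          # attribute
        "title": boek.findtext("titel"),            # element value
        "author": boek.findtext("auteur") or None,  # empty element -> None
    })

print(books)
```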
  • XML Schema
    • XML Schema
      • Uses XML to denote the structure of a document
    • The XML schema determines, for the book list document:
      • Elements:
        • Boekenlijst
        • Boek
        • Titel
        • Auteur
      • Order
      • Types (of values)
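
The slides do not show the schema itself, so the XSD below is an assumed reconstruction of what a schema for the book list might look like. The sketch only demonstrates that a schema is itself an XML document: it is parsed with the standard library and the declared element names are listed in order. Actually validating a document against an XSD needs a third-party library, since Python's standard library has no XSD validator.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Assumed schema for the book list; the slides do not show the actual XSD.
BOOK_LIST_XSD = f"""<xs:schema xmlns:xs="{XS}">
  <xs:element name="boekenlijst">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="boek" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="titel" type="xs:string"/>
              <xs:element name="auteur" type="xs:string"/>
            </xs:sequence>
            <xs:attribute name="categorie" type="xs:string"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = ET.fromstring(BOOK_LIST_XSD)

# The schema is itself XML, so the declared elements can be listed in
# document order with an ordinary XML parser.
declared = [el.get("name") for el in schema.iter(f"{{{XS}}}element")]
print(declared)
```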
  • Metadata Standard
    • Describe structure of metadata using XML schema
    • Textual specification, explains semantics of the elements
      • titleInfo: “A word, phrase, character, or group of characters, normally appearing in a resource, that names it or the work contained in it.”
    • The MODS XML schema determines the structure of a MODS document:
        <?xml version="1.0" encoding="UTF-8" ?>
        <mods xmlns="http://www.loc.gov/mods/…" …>
          <titleInfo>
            <title>De geruchten</title>
          </titleInfo>
          <name type="personal">…</name>
          <typeOfResource>text</typeOfResource>
          <originInfo>… </originInfo>
          ...
        </mods>
  • Use of Metadata Standards
    • Shared information uses common structure
    • Standard software can be used to parse information
    • Databases (DB) that exchange MODS documents speak the same language
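
Because every MODS record shares one structure, the same few lines of generic code can read a record from any source. The sketch below uses the standard `xml.etree.ElementTree` parser. The namespace `http://www.loc.gov/mods/v3` is the MODS version 3 namespace and is an assumption here, since the slide truncates the URI; the function name `summarize` and the returned keys are hypothetical.

```python
import xml.etree.ElementTree as ET

# MODS version 3 namespace (an assumption; the slide truncates the URI).
NS = {"mods": "http://www.loc.gov/mods/v3"}

MODS_RECORD = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>De geruchten</title></titleInfo>
  <name type="personal">
    <namePart>Claus, Hugo</namePart>
    <namePart type="date">1929-</namePart>
  </name>
  <typeOfResource>text</typeOfResource>
</mods>"""


def summarize(mods_xml: str) -> dict:
    """Pull a few fields out of any record that follows the MODS structure."""
    root = ET.fromstring(mods_xml)
    return {
        "title": root.findtext("mods:titleInfo/mods:title", namespaces=NS),
        "names": [
            part.text
            for part in root.findall("mods:name/mods:namePart", NS)
            if part.get("type") is None  # skip date parts
        ],
        "type": root.findtext("mods:typeOfResource", namespaces=NS),
    }


print(summarize(MODS_RECORD))
```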
  • Metadata Standards
    • Different Metadata Standards exist!
      • Different domains
      • Different communities
      • Different formats
      • Different focus
  • Problem with Metadata Standards
      • Different metadata standards can describe the same thing
        • But in different ways!
      • Metadata example 1: CVML (Computer Vision Markup Language)
          <object id="0">
            <box xc="77" yc="73" w="21" h="16" />
          </object>
        • Box: “Coordinates of the centre and the dimensions of the bounding box of a detected object in pixels.”
      • Metadata example 2: VS7 (Video Surveillance Schema)
          <LLID="LLID1"><Mask> <BB mp7:dim="4"> 67 65 88 91 </BB> </Mask></LLID>
        • BB: “Coordinates of a rectangular segment.”
    Detection and Representation of Moving Objects for Video Surveillance Chris Poppe Ghent, Belgium – June 9 2009
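
Using both descriptions in one system requires a mapping between them. As a rough sketch, the code below converts the CVML centre-and-size box into corner coordinates. The target layout `(left, top, right, bottom)` is a hypothetical choice for illustration; it is not taken from the VS7 specification, and the result is not claimed to reproduce the BB values shown on the slide.

```python
import xml.etree.ElementTree as ET

CVML_OBJECT = '<object id="0"><box xc="77" yc="73" w="21" h="16"/></object>'


def centre_box_to_corners(cvml_xml: str) -> tuple:
    """Convert a centre/size box into (left, top, right, bottom) corners."""
    box = ET.fromstring(cvml_xml).find("box")
    xc, yc = float(box.get("xc")), float(box.get("yc"))
    w, h = float(box.get("w")), float(box.get("h"))
    return (xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)


print(centre_box_to_corners(CVML_OBJECT))
```

Even this small mapping has to decide on rounding and on the meaning of each number, which is exactly the knowledge a structural schema alone does not carry.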
  • Problems with Metadata Standards
    • Current metadata standards define structure of metadata
    • Mappings are needed to use different standards within one system
    • Metadata standard does not solve everything!
      • For instance: DC creator property
        • Creator=“Shakespeare, William”
        • Creator=“William Shakespeare”
        • Creator=“Shakespeare”
        • Creator=“W. Shakespare”
      • Same problems as tagging can occur
    • Lack of ways to describe semantics of metadata
      • Currently plain text
      • Not machine readable
    • Multimedia content shifts to online repositories
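
The creator example can be tried out in a few lines. The sketch below applies one simple, assumed normalization rule (rewrite "First Last" as "Last, First") to the four values from the slide; the function name `normalize_creator` is hypothetical.

```python
def normalize_creator(value: str) -> str:
    """Rewrite 'First Last' as 'Last, First'; leave other forms untouched."""
    value = " ".join(value.split())  # collapse stray whitespace
    if "," in value:
        last, first = (part.strip() for part in value.split(",", 1))
        return f"{last}, {first}"
    parts = value.split(" ")
    if len(parts) == 1:
        return parts[0]
    return f"{parts[-1]}, {' '.join(parts[:-1])}"


variants = [
    "Shakespeare, William",
    "William Shakespeare",
    "Shakespeare",
    "W. Shakespare",
]
print(sorted({normalize_creator(v) for v in variants}))
```

Formatting rules merge only the first two variants. The bare surname and the misspelled abbreviation remain distinct values, which is why a purely structural standard does not solve the problem.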
  • Semantic Web ?.0
  • The Syntactic Web
    • Consider a typical web page:
    • Mark-up consists of:
      • rendering information (e.g., font size and colour)
      • Hyper-links to related content
    • Semantic content is accessible to humans but not (easily) to computers…
  • Impossible (?) using the Syntactic Web…
    • Complex queries involving background knowledge
      • Give me the telephone number of the person at Multimedia Lab who is responsible for the demo about metadata applications
    • Locating information in data repositories
      • Travel enquiries
      • Prices of goods and services
      • Results of human genome experiments
    • Finding and using “web services”
      • Visualize surface interactions between two proteins
    • Delegating complex tasks to web “agents”
      • Book me a holiday next weekend somewhere warm, not too far away, and where they speak French or English
  • Semantic Web Technologies
    • Technologies developed by the World Wide Web Consortium (W3C)
    • Vision: the Web as universal medium for data, information and knowledge exchange
    • HTML, XML -> RDF, RDFS, OWL, …
    • Technologies to interconnect, exchange information
      • Applicable for metadata also!
  • Why is XML not enough?
    • http://www.w3.org/DesignIssues/RDF-XML.html (Tim Berners-Lee)
    • Try to express “The author of the note is Tim” in XML:
        <author> <uri>note</uri> <name>Tim</name> </author>
        <document href="note"> <author>Tim</author> </document>
        <document uri="note" author="Tim" />
    • For a person, the three representations mean the same thing, but NOT for a machine!
      • XML contains structures only, no semantics
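
The point can be checked directly: a program needs a different, hand-written accessor for each of the three encodings before it can see that they state the same fact. A minimal sketch with the standard XML parser follows; the list name `ENCODINGS` and the lambdas are illustrative only.

```python
import xml.etree.ElementTree as ET

# The three encodings from the slide, each paired with the hand-written
# accessor needed to recover (document, author) from that particular shape.
ENCODINGS = [
    ("<author><uri>note</uri><name>Tim</name></author>",
     lambda e: (e.findtext("uri"), e.findtext("name"))),
    ('<document href="note"><author>Tim</author></document>',
     lambda e: (e.get("href"), e.findtext("author"))),
    ('<document uri="note" author="Tim" />',
     lambda e: (e.get("uri"), e.get("author"))),
]

facts = {extract(ET.fromstring(xml)) for xml, extract in ENCODINGS}
print(facts)
```

All three collapse to one fact only because a human wrote the three accessors; nothing in the XML itself tells the machine which part is the subject and which is the author.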
  • RDF
    • RDF (Resource Description Framework)
    • Triples: subject – predicate – object
    • URI to identify resources
    • “The author of the note is Tim”
      • Graph: Note --hasAuthor--> Tim
    • Serialization in XML:
        <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
          <Note rdf:about="http://www.example.org/#note">
            <hasAuthor rdf:resource="http://www.example.org/#Tim"/>
          </Note>
        </rdf:RDF>
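
Independently of any serialization, the statement is a single subject, predicate, object triple. The sketch below models triples as plain Python tuples and matches them against a pattern; it is not an RDF library, and placing `hasAuthor` under the `http://www.example.org/#` prefix is an assumption.

```python
EX = "http://www.example.org/#"

# One statement = one (subject, predicate, object) triple of URIs.
triples = {
    (EX + "note", EX + "hasAuthor", EX + "Tim"),
}


def match(pattern, store):
    """Return all triples matching a pattern; None acts as a wildcard."""
    return [
        triple for triple in store
        if all(want is None or want == got for want, got in zip(pattern, triple))
    ]


# "Who is the author of the note?"
authors = [o for _, _, o in match((EX + "note", EX + "hasAuthor", None), triples)]
print(authors)
```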
  • RDFS
    • RDF Schema
    • Standardized vocabulary for describing concepts
    • Introduces classes and instances
    • Subclasses, sub properties
      • Possible to define hierarchies!
    • Graph: Note1 --hasAuthor--> Tim; Note1 --type--> Class Note; Tim --type--> Class Person
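
Class hierarchies make simple inference possible: an instance of a subclass is also an instance of every superclass. The sketch below works on plain tuples rather than an RDF library. The `ex:Document` superclass is a hypothetical addition that does not appear on the slide.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

triples = {
    ("ex:Note1", RDF_TYPE, "ex:Note"),
    ("ex:Tim", RDF_TYPE, "ex:Person"),
    ("ex:Note1", "ex:hasAuthor", "ex:Tim"),
    # Hypothetical hierarchy, not on the slide: every Note is a Document.
    ("ex:Note", SUBCLASS_OF, "ex:Document"),
}


def infer_types(store):
    """Add rdf:type triples implied by rdfs:subClassOf until nothing changes."""
    inferred = set(store)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for sub, p2, sup in list(inferred):
                if p2 == SUBCLASS_OF and sub == o and (s, RDF_TYPE, sup) not in inferred:
                    inferred.add((s, RDF_TYPE, sup))
                    changed = True
    return inferred


closure = infer_types(triples)
print(("ex:Note1", RDF_TYPE, "ex:Document") in closure)
```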
  • OWL
    • Web Ontology Language, W3C recommendation (2004)
    • Provides richer vocabulary
    • Define advanced relations
      • Data typing
      • Cardinalities
      • Rich typing of properties
    • Example: isAuthorFrom is declared as the inverse of hasAuthor
        <owl:ObjectProperty rdf:ID="isAuthorFrom">
          <owl:inverseOf rdf:resource="#hasAuthor"/>
        </owl:ObjectProperty>
      • Graph: Note1 --hasAuthor--> Tim; Tim --isAuthorFrom--> Note1; Note1 --type--> Class Note; Tim --type--> Class Person
    • Allows for intelligent reasoning
    • Complex ontologies can be created
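
What a reasoner does with an inverse-property declaration can be sketched in a few lines: for every stated triple it adds the reversed triple under the inverse property. This is an illustration on plain tuples, not an OWL reasoner; the names `INVERSE_OF` and `apply_inverses` are hypothetical.

```python
# Declared inverse pairs, mirroring owl:inverseOf on the slide.
INVERSE_OF = {"isAuthorFrom": "hasAuthor"}

triples = {("Note1", "hasAuthor", "Tim")}


def apply_inverses(store, inverse_of):
    """Return the store plus every triple implied by an inverse declaration."""
    # An inverse relation works in both directions.
    both_ways = {**inverse_of, **{v: k for k, v in inverse_of.items()}}
    implied = {
        (o, both_ways[p], s) for s, p, o in store if p in both_ways
    }
    return store | implied


print(sorted(apply_inverses(triples, INVERSE_OF)))
```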
  • Ontology
    • Information in a domain is structured using an ontology
      • A data model that represents a set of concepts, and the relations among these concepts, within a specific domain
    • Thesaurus
      • Dictionary
        • Synonyms
    • Taxonomy
      • Hierarchy
        • Subclass and siblings
    • Ontology
      • concepts
      • relations
  • Ontology (using OWL)
    • Example: ontology for domain of science
      • Class: Person
      • Class: Scientist (subClassOf Person)
      • DatatypeProperty: birth date
      • Individual: “Joseph Plateau” (a Scientist), birth date “14/10/1801”
    • OWL constructs
      • Class
      • DatatypeProperty
      • subClassOf
      • Individual
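
The example ontology can be written down as a handful of triples. The sketch below reads the datatype property as a real date value and checks class membership through one subClassOf step. It works on plain tuples rather than OWL tooling, and the day/month/year reading of the birth date is an assumption based on the notation on the slide.

```python
from datetime import date, datetime

# The ontology from the slide, written as plain triples.
ontology = [
    ("Scientist", "subClassOf", "Person"),
    ("Joseph Plateau", "type", "Scientist"),
    ("Joseph Plateau", "birth date", "14/10/1801"),
]


def birth_date(individual: str) -> date:
    """Read the 'birth date' datatype property as a real date value."""
    for s, p, o in ontology:
        if s == individual and p == "birth date":
            # Day/month/year order is assumed from the slide's notation.
            return datetime.strptime(o, "%d/%m/%Y").date()
    raise KeyError(individual)


def is_a(individual: str, cls: str) -> bool:
    """True if the individual has the class directly or via one subClassOf step."""
    direct = {o for s, p, o in ontology if s == individual and p == "type"}
    parents = {o for s, p, o in ontology if p == "subClassOf" and s in direct}
    return cls in direct | parents


print(birth_date("Joseph Plateau"), is_a("Joseph Plateau", "Person"))
```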
  • Semantic Web Technologies
    • SPARQL Protocol And RDF Query Language (SPARQL)
      • SQL-like language for RDF
      • Example: search for all the notes of Tim
        • SELECT ?x WHERE { ?x :hasAuthor :Tim }
    • Rule Interchange Format (RIF)
      • Example rule: if Tim is the author of the note, he is also a contributor
      • Goal is to create an interchange format for different rule languages and inference engines
      • Closely related to ontologies
        • Rules combine information and derive new information on top of ontologies
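
Both ideas can be imitated on a small in-memory triple set: a query that selects all notes written by Tim, and a rule that derives a contributor statement from every author statement. This is a rough analogue in plain Python, not SPARQL or RIF itself; the note identifiers, the second author, and the `:hasContributor` property are hypothetical.

```python
triples = {
    (":note1", ":hasAuthor", ":Tim"),
    (":note2", ":hasAuthor", ":Tim"),
    (":note3", ":hasAuthor", ":Ann"),  # hypothetical second author
}


def select_subjects(store, predicate, obj):
    """Rough analogue of: SELECT ?x WHERE { ?x <predicate> <obj> }"""
    return sorted(s for s, p, o in store if p == predicate and o == obj)


def author_implies_contributor(store):
    """Rule: whenever ?x hasAuthor ?y holds, ?x hasContributor ?y also holds."""
    derived = {(s, ":hasContributor", o) for s, p, o in store if p == ":hasAuthor"}
    return store | derived


print(select_subjects(triples, ":hasAuthor", ":Tim"))
enriched = author_implies_contributor(triples)
print(select_subjects(enriched, ":hasContributor", ":Tim"))
```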
  • Semantic Web Technologies
    • Data on the web can be linked to each other
      • Example: ontology on Brussels
        • DBpedia.org
      • Browsing:
        • Brussels -> cityofbirth -> Raymond_Goethals -> managerclubs -> RSC Anderlecht …
      • Querying: find all people born in Brussels before 1930
      • Reasoning: if a person was born in Brussels, he was also born in Belgium
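
Browsing, querying, and reasoning over linked data can be imitated on a toy triple set. The people, birth years, and property names below are placeholders invented for the sketch; they are not DBpedia data, and a real application would query DBpedia instead.

```python
# Toy triples in the spirit of DBpedia; people and years are placeholders.
triples = {
    ("PersonA", "cityOfBirth", "Brussels"),
    ("PersonA", "birthYear", 1921),
    ("PersonB", "cityOfBirth", "Brussels"),
    ("PersonB", "birthYear", 1954),
    ("PersonC", "cityOfBirth", "Ghent"),
    ("PersonC", "birthYear", 1901),
    ("Brussels", "locatedIn", "Belgium"),
    ("Ghent", "locatedIn", "Belgium"),
}


def objects(store, subject, predicate):
    """Browsing: follow one link from a subject along a predicate."""
    return {o for s, p, o in store if s == subject and p == predicate}


def born_in_before(store, city, year):
    """Querying: people born in a city before a given year."""
    people = {s for s, p, o in store if p == "cityOfBirth" and o == city}
    return sorted(
        person for person in people
        if any(y < year for y in objects(store, person, "birthYear"))
    )


def countries_of_birth(store, person):
    """Reasoning: born in a city implies born in that city's country."""
    return {
        country
        for city in objects(store, person, "cityOfBirth")
        for country in objects(store, city, "locatedIn")
    }


print(born_in_before(triples, "Brussels", 1930))
print(countries_of_birth(triples, "PersonA"))
```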
  • Semantic Web Technologies
  • Semantic Web ?.0
  • Conclusions
    • Use metadata standards!
      • Allows interchange
      • Structures the metadata
    • When no standard is sufficient
      • Apply proprietary format
      • Structures the metadata
    • If tagging is needed for search/browsing/retrieval
      • Provide fixed structure
        • E.g., who, what, where, when, …
      • Provide fixed vocabulary
        • Thesaurus
        • Hierarchy
        • Ontology for advanced reasoning
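
The last recommendation, a fixed structure plus a fixed vocabulary for tags, can be enforced with a small validation step. The following is a minimal sketch; the facet names follow the who/what/where/when example above, while the vocabulary entries and the function name `validate` are hypothetical.

```python
# Fixed structure: every annotation must fill exactly these facets.
FACETS = ("who", "what", "where", "when")

# Fixed vocabulary per facet (hypothetical entries); None means free text.
VOCABULARY = {
    "who": None,
    "what": {"lecture", "exhibition", "workshop"},
    "where": {"Brussels", "Ghent"},
    "when": None,
}


def validate(annotation: dict) -> list:
    """Return a list of problems; an empty list means the annotation is valid."""
    problems = []
    for facet in FACETS:
        value = annotation.get(facet)
        if not value:
            problems.append(f"missing facet: {facet}")
        elif VOCABULARY[facet] is not None and value not in VOCABULARY[facet]:
            problems.append(f"unknown {facet} term: {value}")
    for extra in sorted(set(annotation) - set(FACETS)):
        problems.append(f"unexpected facet: {extra}")
    return problems


good = {"who": "Chris Poppe", "what": "workshop",
        "where": "Brussels", "when": "2009-09-29"}
bad = {"who": "Chris Poppe", "what": "talk", "where": "Brussels"}
print(validate(good))
print(validate(bad))
```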
    • Questions?