Metadata - Chris Poppe - MMLab/IBBT

  • MODS (Metadata Object Description Schema, Library of Congress): a schema for a set of bibliographic elements, used mainly in library applications.

    1. Metadata: Starting Points, Priorities, Future Perspectives, and Notes from the Margin
       Chris Poppe, Multimedia Lab, Department of Electronics and Information Systems, Faculty of Engineering, Ghent University
    2. Multimedia Lab
       • Research group of Ghent University (Faculty of Engineering)
       • Multimedia
         • Video!
           • Coding, processing, transmission, analysis, adaptation, annotation, …
       • Topics: metadata, XML technologies, Semantic Web technologies, standardization, metadata management, …
       Footer: Metadata, Chris Poppe, Levend Geheugen, Brussels, Belgium – September 29 2009
    3. Outline
       • What is metadata?
       • Metadata vs. tags?
         • Benefits/disadvantages?
       • What is a metadata standard?
         • Why is it needed?
         • What does it look like?
         • What are the problems?
       • What is the Semantic Web?
         • Web 2.0?
         • Web 3.0?
         • Semantic Web technologies?
       • Conclusions
    4. Metadata
       • Data describing data
       • Museum for the history of sciences
    5. Metadata
       • Data describing data
       • Digital content
         • Resolution, DPI, date/time created, creator, camera used, file format (JPG, BMP, GIF, PNG, …), location shot (GPS), copyright, title, genre, rating, comment, depicted event, keywords/tags, …
    6. Metadata: tags
       • Tag
         • “Brussels”, “City hall”, “Building”, “nice”, …
         • Free-text annotation
         • Keywords, terms, comments
         • Informal
         • Personal
         • Started as taxonomies or vocabularies used to describe content
         • Evolved into folksonomies
           • User-driven
    7. Taxonomies
       • Example
         • Dewey Decimal Classification
         • Library classification
    8. Taxonomies
       • Top down
       • Pre-defined structure
       • Hierarchy
       • Controlled vocabularies
       • Expert
    9. Folksonomy
       • Folk + taxonomy
         • Free-form text annotation
         • No predefined structure
         • No hierarchy
         • Users add metadata
         • Flat name space
         • Bottom up
       • Two types:
         • Broad
         • Narrow
    10. Broad Folksonomy
       • Tagging shared content
       • Anyone can participate
       • Examples
    11. Narrow Folksonomy
       • Tagging your own content
       • Tagging friends' content
         • No consolidation
         • No emerging vocabularies
    12. Tagging usage
       • Navigation
         • Tag clouds
         • Organization
         • Hints
    13. Tagging how-to?
       • Totally free
       • Semi-structured
       • Hinted
    14. Tagging problems
       • Cultural differences: Genghis Khan is a hero for some and a criminal for others
       • Communities of users can give different meanings to tags: movie vs. film vs. cinema
       • Language issues
       • Ambiguity (“apple”)
       • Misspelled tags (40% Flickr, 28% del.icio.us)
       • Different goals of tags
         • Factual tags (what it is, what it is about): ‘image’, ‘article’, ‘blog’, …
         • Subjective tags (the user's opinion): ‘funny’, ‘hot’, ‘stupid’, …
         • Personal tags (self-reference): ‘toread’, ‘mycomments’, …
       • Semantic issues, e.g. the tag “nothing to do with Brussels”
    15. Metadata
       • Data describing data
       • Digital content
         • Resolution, DPI, date/time created, creator, camera used, file format (JPG, BMP, GIF, PNG, …), location shot (GPS), copyright, title, genre, rating, comment, depicted event, keywords/tags, …
    16. Use of Metadata
       • Understanding of multimedia content
       • Sharing
       • Management
       • Retrieval
         • Search
         • Browse
       • Processing
    17. Multimedia
       • Compression and container formats
       • Standards for multimedia
         • Standards for metadata?
       • Formats shown on the slide (grouped there as video compression, audio compression, image compression, and physical containers): MP2, JPEG, MPEG-2, MXF, JPEG2000, AVI, AAC, H.264/MPEG-4 AVC, PNG, Motion JPEG2000, TIFF, MP4, MPEG, WAV, FLAC, VC-1, Ogg Vorbis, DivX, AIFF, GIF, JPEG-LS, Matroska, OGM/OGG, Windows Media Audio, DIRAC, 3GP, DV, FLV, Betacam, Realmedia, MOV, AC-3/Dolby Digital, Theora, ASF, TTA
    18. Metadata Standard
       • Standard which determines the structure of metadata
         • Typically using XML
       • Example: MODS (Metadata Object Description Schema)
           <?xml version="1.0" encoding="UTF-8" ?>
           <mods xmlns= http://www.loc.gov/mods/ …
             <titleInfo>
               <title>De geruchten</title>
             </titleInfo>
             <name type="personal">
               <namePart>Claus, Hugo</namePart>
               <namePart type="date">1929-</namePart>
               <role>
                 <text>creator</text>
               </role>
             </name>
             <typeOfResource>text</typeOfResource>
             <originInfo>… </originInfo>
             ...
           </mods>
       • Metadata fields: resolution, DPI, date/time created, creator, camera used, file format (JPG, BMP, GIF, PNG, …), location shot (GPS), copyright, title, genre, rating, comment, keywords, depicted event, …
    19. XML
       • XML (Extensible Markup Language)
         • Standardized by W3C (World Wide Web Consortium)
         • Language to define the structure of a document
       • Building blocks: XML element, attribute, values
           <?xml version="1.0" encoding="UTF-8" ?>
           <!-- This is a list of books. -->
           <booklist>
             <book category="thriller">
               <title>Het Bernini Mysterie</title>
               <author>Dan Brown</author>
             </book>
             <book category="woordenboek">
               <title>Van Dale Frans-Nederlands</title>
               <author />
             </book>
           </booklist>
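As a minimal sketch of how the three building blocks on this slide (element, attribute, value) appear to a program, the book list can be read with Python's standard `xml.etree.ElementTree` module. The variable names are illustrative and not part of the slides.

```python
import xml.etree.ElementTree as ET

# The book list from the slide (XML declaration and comment omitted).
BOOKLIST_XML = """
<booklist>
  <book category="thriller">
    <title>Het Bernini Mysterie</title>
    <author>Dan Brown</author>
  </book>
  <book category="woordenboek">
    <title>Van Dale Frans-Nederlands</title>
    <author />
  </book>
</booklist>
"""

root = ET.fromstring(BOOKLIST_XML)

books = []
for book in root.findall("book"):               # XML element
    category = book.get("category")             # attribute
    title = book.findtext("title")              # value (text content)
    author = book.findtext("author") or None    # empty <author /> becomes None
    books.append((category, title, author))

for entry in books:
    print(entry)
```

The parser recovers the structure without any knowledge of what a "book" or an "author" is, which is the limitation the later slides on semantics return to.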
    20. XML Schema
       • XML Schema
         • Uses XML to denote the structure of a document
       • The XML schema determines:
         • Elements: booklist, book, title, author
         • Order
         • Types (of values)
       • Document governed by the schema:
           <?xml version="1.0" encoding="UTF-8" ?>
           <!-- This is a list of books. -->
           <booklist>
             <book category="thriller">
               <title>Het Bernini Mysterie</title>
               <author>Dan Brown</author>
             </book>
             <book category="woordenboek">
               <title>Van Dale Frans-Nederlands</title>
               <author />
             </book>
           </booklist>
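To make concrete what "determines elements and order" means, here is a small hand-rolled structural check in Python. It is only a sketch of the idea and not an XML Schema validator; the rule table and the function name `is_valid_booklist` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Children every <book> must have, in this exact order (illustrative rule).
BOOK_CHILDREN = ["title", "author"]


def is_valid_booklist(xml_text):
    """Check element names and child order for the book list structure."""
    root = ET.fromstring(xml_text)
    if root.tag != "booklist":
        return False
    for book in root:
        if book.tag != "book":
            return False
        if [child.tag for child in book] != BOOK_CHILDREN:
            return False
    return True


good = (
    '<booklist><book category="thriller">'
    "<title>Het Bernini Mysterie</title><author>Dan Brown</author>"
    "</book></booklist>"
)
# Same content, but <author> and <title> are in the wrong order.
bad = (
    "<booklist><book>"
    "<author>Dan Brown</author><title>Het Bernini Mysterie</title>"
    "</book></booklist>"
)

print(is_valid_booklist(good), is_valid_booklist(bad))
```

A real schema expresses these rules declaratively in XML, so that generic validation software can enforce them for any document type.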
    21. Metadata Standard
       • Describes the structure of metadata using an XML schema
       • Textual specification, explains the semantics of the elements
         • titleInfo: “A word, phrase, character, or group of characters, normally appearing in a resource, that names it or the work contained in it.”
       • The MODS XML schema determines the MODS document:
           <?xml version="1.0" encoding="UTF-8" ?>
           <mods xmlns= http://www.loc.gov/mods/ …
             <titleInfo>
               <title>De geruchten</title>
             </titleInfo>
             <name type="personal">…</name>
             <typeOfResource>text</typeOfResource>
             <originInfo>… </originInfo>
             ...
           </mods>
    22. Use of Metadata Standards
       • Shared information uses a common structure
       • Standard software can be used to parse the information
       • Databases (DB to DB) speak the same language by exchanging MODS documents (the slide shows the same document three times):
           <?xml version="1.0" encoding="UTF-8" ?>
           <mods xmlns= http://www.loc.gov/mods/ …
             <titleInfo>
               <title>De geruchten</title>
             </titleInfo>
             <name type="personal">…</name>
             <typeOfResource>text</typeOfResource>
             <originInfo>… </originInfo>
             ...
           </mods>
    23. Metadata Standards
       • Different metadata standards exist!
         • Different domains
         • Different communities
         • Different formats
         • Different focus
    24. Problems Metadata Standards
       • Different metadata standards exist!
       • Different metadata standards can describe the same thing
         • But in different ways!
       • Metadata example 1: CVML (Computer Vision Markup Language)
           <object id="0">
             <box xc="77" yc="73" w="21" h="16" />
           </object>
         Box: “Coordinates of the centre and the dimensions of the bounding box of a detected object in pixels.”
       • Metadata example 2: VS7 (Video Surveillance Schema)
           <LLID="LLID1"><Mask><BB mp7:dim="4">67 65 88 91</BB></Mask></LLID>
         BB: “Coordinates of a rectangular segment.”
       Footer: Detection and Representation of Moving Objects for Video Surveillance, Chris Poppe, Ghent, Belgium – June 9 2009
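The two descriptions above store a rectangle in different forms, so a system that uses both needs a conversion between them. The sketch below converts a centre-and-size box into corner coordinates. It assumes that the four BB values are ordered as x_min, y_min, x_max, y_max; the slide only says "coordinates of a rectangular segment", so this layout and the function name `cvml_box_to_corners` are assumptions.

```python
def cvml_box_to_corners(xc, yc, w, h):
    """Convert a centre/size box into (x_min, y_min, x_max, y_max).

    The corner-based output order is an assumption for illustration;
    the slide does not define the order of the BB values.
    """
    half_w, half_h = w / 2, h / 2
    return (xc - half_w, yc - half_h, xc + half_w, yc + half_h)


# CVML box from the slide: xc=77, yc=73, w=21, h=16
corners = cvml_box_to_corners(77, 73, 21, 16)
print(corners)
```

Every pair of standards needs such a mapping, and each one encodes knowledge that the XML structure alone does not carry.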
    25. Problems Metadata Standard
       • Current metadata standards define the structure of metadata
       • Mappings are needed to use different standards within one system
       • A metadata standard does not solve everything!
         • For instance: the DC creator property
           • Creator=“Shakespeare, William”
           • Creator=“William Shakespeare”
           • Creator=“Shakespeare”
           • Creator=“W. Shakespare”
         • The same problems as with tagging can occur
       • Lack of ways to describe the semantics of metadata
         • Currently plain text
         • Not machine readable
       • Multimedia content shifts to online repositories
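The creator example can be tried out directly. The following sketch applies one simple normalisation rule ("Last, First" becomes "First Last") to the four values from the slide; the function name `normalise_creator` is hypothetical and the rule is deliberately naive.

```python
def normalise_creator(value):
    """Rewrite 'Last, First' as 'First Last'; leave other forms untouched."""
    if "," in value:
        last, first = (part.strip() for part in value.split(",", 1))
        return f"{first} {last}"
    return value.strip()


# The four creator values from the slide.
creators = [
    "Shakespeare, William",
    "William Shakespeare",
    "Shakespeare",
    "W. Shakespare",
]

distinct = {normalise_creator(c) for c in creators}
print(sorted(distinct))
```

Only the first two variants collapse into one value. The surname-only and misspelled forms remain distinct, which is why a structural standard alone does not remove the problems known from tagging.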
    26. Semantic Web Technologies
       • Technologies developed by the World Wide Web Consortium (W3C)
       • Vision: the Web as a universal medium for data, information, and knowledge exchange
       • HTML, XML -> RDF, RDFS, OWL, …
       • Focus is on semantics
       • Technologies to interconnect and exchange information
         • Applicable to metadata as well!
    27. Semantic Web ?.0
    28. The Syntactic Web
       • Consider a typical web page:
       • Mark-up consists of:
         • Rendering information (e.g., font size and colour)
         • Hyperlinks to related content
       • Semantic content is accessible to humans but not (easily) to computers…
    29. Impossible (?) using the Syntactic Web…
       • Complex queries involving background knowledge
         • Give me the telephone number of the responsible person within Multimedia Lab for the demo about metadata applications
       • Locating information in data repositories
         • Travel enquiries
         • Prices of goods and services
         • Results of human genome experiments
       • Finding and using “web services”
         • Visualize surface interactions between two proteins
       • Delegating complex tasks to web “agents”
         • Book me a holiday next weekend somewhere warm, not too far away, and where they speak French or English
    30. Why is XML not enough?
       • http://www.w3.org/DesignIssues/RDF-XML.html (Tim Berners-Lee)
       • Try to express “The author of the note is Tim” in XML:
           <author> <uri>note</uri> <name>Tim</name> </author>
           <document href="note"> <author>Tim</author> </document>
           <document uri="note" author="Tim" />
       • For a person, the three representations mean the same, but NOT for a machine!
         • XML contains structure only, no semantics
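The point of this slide can be demonstrated with a few lines of Python using the standard `xml.etree.ElementTree` module. In this sketch, each of the three representations needs its own hand-written extraction rule to recover the author; the list names are illustrative.

```python
import xml.etree.ElementTree as ET

# The three representations from the slide.
variants = [
    "<author><uri>note</uri><name>Tim</name></author>",
    '<document href="note"><author>Tim</author></document>',
    '<document uri="note" author="Tim" />',
]

# One hand-written extractor per structure; a parser cannot infer these.
extractors = [
    lambda root: root.findtext("name"),
    lambda root: root.findtext("author"),
    lambda root: root.get("author"),
]

authors = [
    extract(ET.fromstring(xml)) for xml, extract in zip(variants, extractors)
]

# Applying the first extractor to every variant only works for the first one.
naive = [extractors[0](ET.fromstring(xml)) for xml in variants]

print(authors)
print(naive)
```

The knowledge that all three documents state the same fact lives in the programmer's head, not in the data.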
    31. RDF
       • RDF (Resource Description Framework)
       • Triples: subject – predicate – object
       • URIs identify resources
       • “The author of the note is Tim”
         • Graph: Note -hasAuthor-> Tim
       • Serialization in XML:
           <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
             <Note rdf:about="http://www.example.org/#note">
               <hasAuthor rdf:resource="http://www.example.org/#Tim"/>
             </Note>
           </rdf:RDF>
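Independent of the XML serialization, the underlying data model is just a set of triples. The sketch below stores the statement from the slide as a Python tuple and answers a simple pattern query. Placing `hasAuthor` in the `http://www.example.org/#` namespace is an assumption (the slide leaves the property unqualified), and the `match` helper is hypothetical rather than part of any RDF library.

```python
# A triple is (subject, predicate, object); resources are identified by URIs.
EX = "http://www.example.org/#"   # example namespace taken from the slide

triples = {
    (EX + "note", EX + "hasAuthor", EX + "Tim"),
}


def match(store, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# "Who is the author of the note?"
result = match(triples, s=EX + "note", p=EX + "hasAuthor")
print(result)
```

Because every statement has the same three-part shape, one generic query function works for any data, unlike the XML variants on the previous slide.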
    32. RDFS
       • RDF Schema
       • Standardized vocabulary for describing concepts
       • Introduces classes and instances
       • Subclasses, subproperties
         • Possible to define hierarchies!
       • Diagram: Note1 -hasAuthor-> Tim; Note1 type Class Note; Tim type Class Person
    33. OWL
       • Web Ontology Language, W3C Recommendation (2004)
       • Provides a richer vocabulary
       • Defines advanced relations
         • Data typing
         • Cardinalities
         • Rich typing of properties
         • …
       • Example:
           <owl:ObjectProperty rdf:ID="isAuthorFrom">
             <owl:inverseOf rdf:resource="#hasAuthor" />
           </owl:ObjectProperty>
         Diagram: Note1 -hasAuthor-> Tim; Tim -isAuthorFrom-> Note1; Note1 type Class Note; Tim type Class Person
       • Allows for intelligent reasoning
       • Complex ontologies can be created
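What a reasoner does with the `owl:inverseOf` declaration can be sketched in plain Python. This is a toy illustration over tuples, not an OWL reasoner; the function name `apply_inverses` and the short resource names are assumptions.

```python
triples = {("Note1", "hasAuthor", "Tim")}

# Declared inverse pairs, mirroring the owl:inverseOf statement on the slide.
inverse_of = {"isAuthorFrom": "hasAuthor"}


def apply_inverses(store, inverses):
    """Add (o, inverse, s) for every (s, p, o) whose property has an inverse."""
    inferred = set(store)
    for inverse, prop in inverses.items():
        for s, p, o in store:
            if p == prop:
                inferred.add((o, inverse, s))
            elif p == inverse:
                inferred.add((o, prop, s))
    return inferred


closed = apply_inverses(triples, inverse_of)
print(sorted(closed))
```

Only the `hasAuthor` statement was asserted; the `isAuthorFrom` statement is derived from the ontology.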
    34. Ontology
       • Information in a domain is structured using an ontology
         • A data model that represents a set of concepts and the relations amongst these concepts within a specific domain
       • Thesaurus
         • Dictionary
           • Synonyms
       • Taxonomy
         • Hierarchy
           • Subclass and siblings
       • Ontology
         • Concepts
         • Relations
    35. Ontology (using OWL)
       • Example: ontology for the domain of science
         • Class: Person, with DatatypeProperty “birth date”
         • Class: Scientist, subClassOf Person
         • Individual: “Joseph Plateau” (a Scientist), birth date “14/10/1801”
       • OWL constructs
         • Class
         • DatatypeProperty
         • subClassOf
         • Individual
         • …
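The subclass relation in this example already supports a simple inference: an individual of class Scientist is also a Person. The following sketch derives that fact over plain tuples. It is an illustration of the idea rather than an OWL reasoner, and the function name `infer_types` is hypothetical.

```python
triples = {
    ("Scientist", "subClassOf", "Person"),
    ("Joseph Plateau", "type", "Scientist"),
    ("Joseph Plateau", "birth date", "14/10/1801"),
}


def infer_types(store):
    """Propagate 'type' along 'subClassOf' until nothing new is derived."""
    inferred = set(store)
    changed = True
    while changed:
        changed = False
        for s, p, cls in list(inferred):
            if p != "type":
                continue
            for sub, q, sup in list(inferred):
                if q == "subClassOf" and sub == cls:
                    if (s, "type", sup) not in inferred:
                        inferred.add((s, "type", sup))
                        changed = True
    return inferred


closed = infer_types(triples)
print(("Joseph Plateau", "type", "Person") in closed)
```

A search for all persons would therefore return Joseph Plateau even though he was only annotated as a scientist.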
    36. Semantic Web Technologies
       • SPARQL Protocol and RDF Query Language (SPARQL)
         • SQL-like language for RDF
         • Example: search for all the notes of Tim
           • SELECT ?x WHERE { ?x hasAuthor Tim }
       • Rule Interchange Format (RIF)
         • Example rule: if Tim is the author of the note, he is also a contributor
         • The goal is to create an interchange format for different rule languages and inference engines
         • Closely related to ontologies
           • Rules combine information and derive new information on top of ontologies
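Both examples on this slide can be imitated over a small set of tuples. The sketch below is neither SPARQL nor RIF; it only shows what the query and the rule compute. The resources `memo`, `report`, and `Ann` and the property `hasContributor` are made up for illustration.

```python
triples = {
    ("note", "hasAuthor", "Tim"),
    ("memo", "hasAuthor", "Tim"),     # hypothetical extra data
    ("report", "hasAuthor", "Ann"),   # hypothetical extra data
}

# Query: SELECT ?x WHERE { ?x hasAuthor Tim }
notes_of_tim = sorted(
    s for s, p, o in triples if p == "hasAuthor" and o == "Tim"
)

# Rule: if ?p is the author of ?x, then ?p is also a contributor of ?x.
derived = {(s, "hasContributor", o) for s, p, o in triples if p == "hasAuthor"}
knowledge = triples | derived

print(notes_of_tim)
print(("note", "hasContributor", "Tim") in knowledge)
```

The query selects from what is stated, while the rule adds statements that were never entered by hand.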
    37. Semantic Web Technologies
       • Data on the web can be linked to each other
         • Example: ontology on Brussels
           • DBpedia.org
         • Browsing:
           • Brussels -> cityofbirth -> Raymond_Goethals -> managerclubs -> RSC Anderlecht …
         • Querying: find all people born in Brussels before 1930
         • Reasoning: if a person was born in Brussels, that person was also born in Belgium
    38. Semantic Web Technologies
    39. Semantic Web ?.0
    40. Conclusions
       • Use metadata standards!
         • Allows interchange
         • Structures the metadata
       • When no standard is sufficient
         • Apply a proprietary format
         • Structures the metadata
       • If tagging is needed for search/browsing/retrieval
         • Provide a fixed structure
           • E.g., who, what, where, when, …
         • Provide a fixed vocabulary
           • Thesaurus
           • Hierarchy
           • Ontology for advanced reasoning
    41. Questions?
