A Semantic Multimedia Web (Part 1)

A Semantic Multimedia Web: Part 1 of a series of lectures given at the Universidad Politécnica de Madrid (UPM), May 2009


1. A Semantic Multimedia Web: Create, Annotate, Present and Share your Media
   Raphaël Troncy, <[email_address]>, CWI, Interactive Information Access

2. #me
   - Academic
     - PhD (2000-2004)
     - PostDoc (2004-2005)
     - Researcher (2005-2009)
     - Lecturer (2009- )
   - Projects / Standardizations
     - K-Space
     - W3C (Media Fragments, Media Annotations)
     - IPTC
   - (Semantic) Web identity

3. Learning Objectives
   - Understand multimedia application workflows
     - Take the canonical processes of media production as a reference model
   - Explore various multimedia metadata formats
     - Be aware of the advantages and limitations of the various models
     - Know the interoperability issues and understand COMM, a Core Ontology for Multimedia
   - Learn about upcoming W3C recommendations that will enable fine-grained access to video on the Web
5. Motivation
   - Growing amount of digital multimedia content available on the Web
   - Content difficult to search and reuse
     - Barely visible to search engines

6. Image search (Yahoo!)

7. Image search (Google fr)

8. Image search (Google en)

9. Video search

10. Image/Video indexing
   - Techniques used by current search engines
     - search term occurs in the filename or in the caption
     - no semantics
   - Image indexing: main problem
     - an image is not alphabetic: there are no countable discrete units that, in combination, provide the meaning of the image
     - image descriptors are not given with the image: one needs to extract or interpret them
   - Video indexing: additional problem
     - a video additionally has a temporal dimension to take into account
     - a video has a priori no discrete units either (i.e. frames, shots and sequences cannot be absolutely defined)
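The filename/caption matching used by current engines can be sketched in a few lines; the photo names and captions below are invented for illustration:

```python
# Minimal sketch of filename/caption image indexing: a query term matches
# only if it occurs literally in the filename or caption. No semantics.
photos = {
    "IMG_4711.jpg": "sunset over the beach",
    "eiffel-tower.jpg": "",
    "DSC_0042.jpg": "Paris by night",
}

def search(term):
    """Return photos whose filename or caption contains the term."""
    term = term.lower()
    return sorted(name for name, caption in photos.items()
                  if term in name.lower() or term in caption.lower())

search("paris")     # caption match only
search("eiffel")    # filename match only
search("landmark")  # no semantics: the Eiffel Tower photo is not found
```

The last query illustrates the slide's point: without semantics, a photo of a famous landmark is invisible to a search for "landmark".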
11. Why is it so difficult to find appropriate multimedia content, to reuse and repurpose previously published content, and to adapt interfaces to this content according to different user needs?
12. Agenda
   - Understanding Multimedia Applications Workflow
     - CeWe Color Photo Book creation application
     - Vox Populi argumentation-based video sequence generation
     - Canonical Processes of Media Production
   - Semantic Annotation of Multimedia Content
     - MPEG-7 based ontologies
     - COMM: A Core Ontology for MultiMedia
   - Media Fragment URIs
     - Support the addressing and retrieval of subparts of time-based Web resources
     - Share clips, retrieve excerpts, annotate parts of media

13. Understanding Multimedia Applications Workflow
   - Identify and define a number of canonical processes of media production
   - Community effort
     - 2005: Dagstuhl seminar
     - 2005: ACM MM Workshop on Multimedia for Human Communication
     - 2008: Multimedia Systems Journal special issue (core model and companion system papers); editors: Frank Nack, Zeljko Obrenovic and Lynda Hardman

14. Overview of Canonical Processes
15. Example 1: CeWe Color PhotoBook
   - Application for authoring digital photo books
   - Automatic selection, sorting and ordering of photos
     - Context analysis methods: timestamp, annotation, etc.
     - Content analysis methods: color histograms, edge detection, etc.
   - Customized layout and background
   - Printed by the European market leader in photo finishing
   - http://www.cewe-photobook.com

16. CeWe Color PhotoBook Processes
   - "My winter ski holidays with my friends"

17. CeWe Color PhotoBook Processes

18. CeWe Color PhotoBook Processes

19. CeWe Color PhotoBook Processes

20. CeWe Color PhotoBook Processes

21. CeWe Color PhotoBook Processes
22. Example 2: Vox Populi Video Sequence Generation
   - Stefano Bocconi, Frank Nack
   - Interview with America: video footage with interviews and background material about the opinions of American people after 9/11
     http://www.interviewwithamerica.com
   - Example question: What do you think of the war in Afghanistan?
   - "I am never a fan of military action, in the big picture I don't think it is ever a good thing, but I think there are circumstances in which I certainly can't think of a more effective way to counter this sort of thing..."

23. Vox Populi Premeditate Process
   - Analogous to the pre-production process in the film industry
     - Static versus dynamic video artifact
   - Output
     - Script, planning of the videos to be captured
     - Questions to the interviewees prepared
     - Profiles of the people interviewed: education, age, gender, race
     - Locations where the interviews take place

24. Vox Populi Annotations
   - Contextual
     - Interviewee (social), locations
   - Descriptive
     - Question asked and transcription of the answers
     - Filmic continuity, examples:
       - gaze direction of speaker (left, centre, right)
       - framing (close-up, medium shot, long shot)
   - Rhetorical
     - Rhetorical statement
     - Argumentation model: Toulmin model

25. Vox Populi Statement Annotations
   - Statements formally annotated as:
     - <subject> <modifier> <predicate>
     - E.g. "war best solution"
   - A thesaurus containing:
     - Terms on the topics discussed (155)
     - Relations between terms: similar (72), opposite (108), generalization (10), specialization (10)
     - E.g. war opposite diplomacy

26. Toulmin Model
   - Elements: Claim, Data, Qualifier, Warrant, Backing, Condition, Concession
   - Corpus annotations: 57 Claims, 16 Data, 4 Concessions, 3 Warrants, 1 Condition
27. Vox Populi Query Interface

28. Vox Populi Organize Process
   - Using the thesaurus, create a graph of related statements
     - nodes are the statements (corresponding to video segments): "war best solution", "diplomacy best solution", "war not solution"
     - edges are either support or contradict
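As a rough sketch (not the actual Vox Populi implementation), the support/contradict classification driven by the thesaurus could look as follows; the `opposite` relation and the three statements are the slide's own examples, and the classification rule is my simplification:

```python
# Sketch of the Organize step: statements annotated as
# <subject> <modifier> <predicate> become graph nodes; edges are derived
# from the thesaurus. Opposite subjects or a negating modifier flip a
# statement, and two flips cancel out.
opposite = {("war", "diplomacy"), ("diplomacy", "war")}

statements = ["war best solution", "diplomacy best solution", "war not solution"]

def relation(s1, s2):
    """Classify the edge between two statements with the same predicate."""
    subj1, mod1, pred1 = s1.split()
    subj2, mod2, pred2 = s2.split()
    if pred1 != pred2:
        return None
    negated = (mod1 == "not") != (mod2 == "not")   # exactly one side negated
    opposed = (subj1, subj2) in opposite           # opposite subjects
    return "contradict" if negated != opposed else "support"

# Build the graph over all statement pairs
edges = {(a, b): relation(a, b)
         for i, a in enumerate(statements) for b in statements[i + 1:]}
```

With the slide's examples, "war best solution" contradicts both other statements, while "diplomacy best solution" and "war not solution" support each other.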
29. Result of Vox Populi Query
   - "I am not a fan of military actions"
   - "War has never solved anything"
   - "I cannot think of a more effective solution"
   - "Two billion dollar bombs on tents"

30. Vox Populi Processes
31. Canonical Processes 101
   - Canonical: reduced to the simplest and most significant form possible without loss of generality
   - Each process
     - short description
     - illustrated with use cases
     - input(s), actor(s) & output(s)
   - Formalization of the processes in UML diagrams in the paper (see literature list)

32. Premeditate
   - Establish initial ideas about media production
     - Design a photo book of my last holidays for my family
     - Create argument-based sequences of videos of interviews after September 11
   - Inputs: ideas, inspirations from human experience
   - Actors:
     - camera owner
     - group of friends
   - Outputs:
     - decision to take camera onto ski slope
     - structured set of questions and locations for interviews

33. Create Media Asset
   - Media assets are captured, generated or transformed
     - Photos taken at unspecified moments at holiday locations
     - Synchronized audio and video of interviewees responding to fixed questions at many locations
   - Inputs:
     - decision to take camera onto ski slope
     - structured set of questions and locations for interviews
   - Actors:
     - (video) camera, editing suite
   - Outputs:
     - images, videos

34. Annotate
   - Annotation is associated with an asset
   - Inputs:
     - photo, video, existing annotation
     - optional thesaurus of terms
   - Actors:
     - human, feature analysis program
   - Outputs:
     - complex structure associating annotations with images, videos
   - Example: Q: "What do you think of the Afghanistan war?" Speaker: Female, Caucasian...

35. Semantic Annotate
   - Annotation uses existing controlled vocabularies
     - Subject matter annotations of your photos (COMM, XMP)
     - Rhetorical annotations in Vox Populi
   - Example: "war best solution" = Subject (thesaurus) + Modifier (thesaurus) + Predicate

36. Package
   - Process artifacts are packaged logically or physically
   - Useful for storing collections of media after capturing...
   - ... before selecting a subset for further stages

37. Query
   - User retrieves a set of process artifacts based on a user-specified query
   - Inputs:
     - user query, in terms of annotations or by example
     - collection(s) of assets
   - Actors:
     - human
   - Output:
     - subset of assets plus annotations (in no order)

38. Construct Message
   - Author specifies the message they wish to convey
     - Our holiday was sporty, with great weather and fun
     - Create a clash about whether war is a good thing
   - Inputs: ideas, decisions, available assets
   - Actors:
     - author
   - Outputs:
     - the message that should be conveyed by the assets

39. Organize
   - Process artifacts are organized according to the message
     - Organize a number of 2-page layouts in the photo book
     - Use the semantic graph to select related video clips to form a linear presentation of parts of the argument structure
   - Inputs: set of assets and annotations (e.g. output from the query process)
   - Actors: human or machine
   - Outputs: document structure with recommended groupings and orderings for assets

40. Publish
   - Presentation is created
     - associated annotations may be removed
     - create proprietary format of photo book for upload
     - create SMIL file containing videos and timing information
   - Inputs: set of assets and annotations (e.g. output from the organize process)
   - Actors: human or machine
   - Outputs:
     - final presentation in a specific document format, such as HTML, SMIL or PDF

41. Distribute
   - Presentation is transported to the end user, who can view and interact with it
     - photo book uploaded to printer, printed, then posted to the user
     - SMIL file is downloaded to the client and played
   - Inputs: published document (output from the publish process)
   - Actors: distribution hardware and software
   - Outputs:
     - media assets presented on the user's device
42. Canonical Processes Possible Flow

43. Sum Up
   - Community agreement, not "yet another model"
   - A large proportion of the functionality provided by multimedia applications can be described in terms of this model
   - Initial step towards the definition of open Web-based data structures for describing and sharing semantically annotated media assets

44. Discussion
   - Frequently asked questions
     - Complex processes
     - Interaction
     - Complex artifacts and annotations can be annotated
   - Towards a more rigorous formalization of the model
     - Relationship to foundational ontologies
     - Semantics of annotations

45. Literature
   - Lynda Hardman: Canonical Processes of Media Production. In ACM Workshop on Multimedia for Human Communication - From Capture to Convey (MHC'05), November 2005.
   - Special Issue on Canonical Processes of Media Production: http://www.springerlink.com/content/j0l4g337581652t1/ and http://www.cwi.nl/~media/projects/canonical/
   - Lynda Hardman, Zeljko Obrenovic, Frank Nack, Brigitte Kerhervé and Kurt Piersol: Canonical Processes of Semantically Annotated Media Production. In Multimedia Systems Journal, 14(6), p327-340, 2008.
   - Philipp Sandhaus, Sabine Thieme and Susanne Boll: Canonical Processes in Photo Book Production. In Multimedia Systems Journal, 14(6), p351-357, 2008.
   - Stefano Bocconi, Frank Nack and Lynda Hardman: Automatic generation of matter-of-opinion video documentaries. In Journal of Web Semantics, 6(2), p139-150, 2008.

46. Agenda
   - Understanding Multimedia Applications Workflow
     - CeWe Color Photo Book creation application
     - Vox Populi argumentative video sequence generation system
     - The Canonical Processes of Media Production
   - Semantic Annotation of Multimedia Content
     - MPEG-7 based ontologies
     - COMM: A Core Ontology for MultiMedia
   - Media Fragment URIs
     - Support the addressing and retrieval of subparts of time-based Web resources
     - Share clips, retrieve excerpts, annotate parts of media
47. Wake Up Video

48. The Importance of the Annotations

49. Multimedia: Description Methods
   - ISO: MPEG-1, MPEG-2, MPEG-4, MPEG-7, MPEG-21
   - W3C

50. MPEG-7: a multimedia description language?
   - ISO standard since December 2001
   - Main components:
     - Descriptors (Ds) and Description Schemes (DSs)
     - DDL (XML Schema + extensions)
   - Concerns all types of media
   - Part 5 - MDS: Multimedia Description Schemes

51. MPEG-7 and the Semantic Web
   - MDS upper layer represented in RDFS
     - 2001: Hunter
     - Later on: link to the ABC upper ontology
   - MDS fully represented in OWL-DL
     - 2004: Tsinaraki et al., DS-MIRF model
   - MPEG-7 fully represented in OWL-DL
     - 2005: Garcia and Celma, Rhizomik model
     - Fully automatic translation of the whole standard
   - MDS and Visual parts represented in OWL-DL
     - 2007: Arndt et al., COMM model
     - Re-engineering MPEG-7 using DOLCE design patterns

52. Requirements [aceMedia, MMSEM XG]
   - MPEG-7 compliance
     - Support most descriptors (decomposition, visual, audio)
   - Syntactic and semantic interoperability
     - Shared and formal semantics represented in a Web language (OWL, RDF/XML, RDFa, etc.)
   - Separation of concerns
     - Domain knowledge versus multimedia-specific information
   - Modularity
     - Enable customization of the multimedia ontology
   - Extensibility
     - Enable inclusion of further (non MPEG-7) descriptors

53. MPEG-7 Based Ontologies

                           Hunter             DS-MIRF            Rhizomik        COMM
   Foundational ontology   ABC                None               None            DOLCE
   Complexity              OWL-Full           OWL-DL             OWL-DL          OWL-DL
   Coverage                MDS+Visual         MDS+CS             All             MDS+Visual
   Applications            Digital Libraries  Digital Libraries  Digital Rights  MM Analysis
54. Common Scenario
   - The "Big Three" at the Yalta Conference (Wikipedia)

55. Common Scenario: Tagging Approach
   - The "Big Three" at the Yalta Conference (Wikipedia), region Reg1
   - Localize a region
     - Draw a bounding box, a circle around a shape
   - Annotate the content
     - Interpret the content
     - Tag: Winston Churchill, UK Prime Minister, Allied Forces, WWII

56. Common Scenario: SW Approach
   - The "Big Three" at the Yalta Conference (Wikipedia), region Reg1
   - Localize a region
     - Draw a bounding box, a circle around a shape
   - Annotate the content
     - Interpret the content
     - Link to knowledge on the Web:
       :Reg1 foaf:depicts dbpedia:Winston_Churchill
       dbpedia:Winston_Churchill skos:altLabel "Sir Winston Leonard Spencer-Churchill"
       dbpedia:Winston_Churchill rdf:type foaf:Person
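A minimal sketch of what the SW approach buys: the triples above, stored here as plain tuples, can be queried and followed further by any consumer, whereas a free-text tag cannot. The tiny `objects` helper is illustrative, not a real RDF API:

```python
# The region Reg1 is a first-class resource; its annotation is a set of
# triples reusing Web identifiers (DBpedia, FOAF, SKOS).
triples = {
    (":Reg1", "foaf:depicts", "dbpedia:Winston_Churchill"),
    ("dbpedia:Winston_Churchill", "skos:altLabel",
     '"Sir Winston Leonard Spencer-Churchill"'),
    ("dbpedia:Winston_Churchill", "rdf:type", "foaf:Person"),
}

def objects(subject, predicate):
    """All o such that (subject, predicate, o) is in the store."""
    return {o for s, p, o in triples if s == subject and p == predicate}

# Unlike a flat tag, the target of the annotation can itself be queried:
person = objects(":Reg1", "foaf:depicts").pop()
labels = objects(person, "skos:altLabel")
```

The point is the chaining: starting from the region, a consumer reaches the DBpedia resource and everything the Web of Data says about it.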
57. Hunter's MPEG-7 Ontology

58. DS-MIRF MPEG-7 Ontology

59. Rhizomik MPEG-7 Ontology

60. COMM: Fragment Identification

61. Comparison
   - Link with domain semantics
     - Hunter: ABC model + mpeg7:depicts relationship
     - DS-MIRF: domain ontologies need to subclass the general MPEG-7 categories
     - Rhizomik: uses the mpeg7:semantic relationship
     - COMM: Semantic Annotation pattern
   - MPEG-7 coverage
     - Hunter: extension of the MPEG-7 visual descriptors
     - COMM:
       - formalization of the context of the annotation
       - representation of the method (algorithm) that provides the annotation

62. Comparison
   - Modeling decisions:
     - DS-MIRF and Rhizomik: 1-to-1 translation from MPEG-7 to OWL/RDF
     - Hunter: simplification and link to the ABC upper model
     - COMM: NO 1-to-1 translation
       - Need for patterns: use DOLCE, a well-designed foundational ontology, as a modeling basis
   - Scalability (triples per annotation): Hunter 11, DS-MIRF 27, Rhizomik 20, COMM 19
64. Scenario: Image
   - The "Big Three" at the Yalta Conference (Wikipedia), region Reg1
   - Localize a region (bounding box)
   - Annotate the content (interpretation)
     - Tag: Winston Churchill, UK Prime Minister, Allied Forces, WWII
     - Link to knowledge on the Web:
       :Reg1 foaf:depicts dbpedia:Winston_Churchill
       dbpedia:Winston_Churchill skos:altLabel "Sir Winston Leonard Spencer-Churchill"
       dbpedia:Winston_Churchill rdf:type foaf:Person

65. Scenario: Video
   - A history of G8 violence (video) (© Reuters), sequences Seq1 and Seq4
   - Localize a region
   - Annotate the content
     - Tags: G8 Summit, Heiligendamm, 2007; EU Summit, Gothenburg, 2001
     - Link to knowledge on the Web:
       :Seq1 foaf:depicts dbpedia:34th_G8_Summit
       :Seq4 foaf:depicts dbpedia:EU_Summit
       geo:Heilegendamn skos:broader geo:Germany

66. Research Problem
   - Multimedia objects are complex
     - Compound information objects, fragment identification
   - Semantic annotation
     - Subjective interpretation, context dependent
   - Linked Data principle
     - Open to reusing existing knowledge
   - From MPEG-7 to RDF to D&S | OIO
67. COMM: Design Rationale
   - Approach:
     - NO 1-to-1 translation from MPEG-7 to OWL/RDF
     - Need for patterns: use DOLCE, a well-designed foundational ontology, as a modeling basis
   - Design patterns:
     - Ontology of Information Objects (OIO)
       - Formalization of information exchange
       - Multimedia = complex compound information objects
     - Descriptions and Situations (D&S)
       - Formalization of context
       - Multimedia = contextual interpretation (situation)
   - Define multimedia patterns that translate MPEG-7 into the DOLCE vocabulary

68. COMM: Core Functionalities
   - Most important MPEG-7 functionalities:
     - Decomposition of multimedia content into segments
     - Annotation of segments with metadata
       - Administrative metadata: creation & production
       - Content-based metadata: audio/visual descriptors
       - Semantic metadata: interface with domain-specific ontologies
   - Note that all are subjective and context-dependent situations

69. COMM: D&S / OIO Patterns
   - Definition of design patterns for decomposition and annotation based on D&S and OIO
     - MPEG-7 describes digital data (multimedia information objects) with digital data (annotations)
     - Digital data entities are information objects
     - Decompositions and annotations are situations that satisfy the rules of a method or algorithm

70. COMM: Decomposition Pattern

71. COMM: Annotation Pattern

72. COMM: Semantic Pattern

73. COMM: Modules
74. Example 1: Fragment Identification

75. Example 1: Region Annotation

76. Example 2: Fragment Identification

77. Example 2: Sequence Annotation

78. Implementation
   - COMM fully formalized in OWL DL
     - Rich axiomatization, consistency checked (FaCT++ v1.1.5)
     - OWL 2: qualified cardinality restrictions for the number restrictions of MPEG-7 low-level descriptors
   - Java API available
     - MPEG-7 class interface for the construction of metadata at runtime

79. KAT Annotation Tool

80. Literature
   - Richard Arndt, Raphaël Troncy, Steffen Staab, Lynda Hardman and Miroslav Vacura: COMM: Designing a Well-Founded Multimedia Ontology for the Web. In 6th International Semantic Web Conference (ISWC'2007), Busan, Korea, November 11-15, 2007.
   - Raphaël Troncy, Oscar Celma, Suzanne Little, Roberto Garcia and Chrisa Tsinaraki: MPEG-7 based Multimedia Ontologies: Interoperability Support or Interoperability Issue? In 1st Workshop on Multimedia Annotation and Retrieval enabled by Shared Ontologies (MAReSO'2007), Genoa, Italy, December 2007.
81. Agenda
   - Understanding Multimedia Applications Workflow
     - CeWe Color Photo Book creation application
     - Vox Populi argumentative video sequence generation system
     - The Canonical Processes of Media Production
   - Semantic Annotation of Multimedia Content
     - MPEG-7 based ontologies
     - COMM: A Core Ontology for MultiMedia
   - Media Fragment URIs
     - Support the addressing and retrieval of subparts of time-based Web resources
     - Share clips, retrieve excerpts, annotate parts of media

82. Where is the Arc de Triomphe?
   http://www.flickr.com/photos/40001315@N00/2705120551/

83. Who is Saphira?
   http://www.flickr.com/photos/mhausenblas/2883727293/
   - Region-based annotation in Flickr... but it cannot be taken out of Flickr
     - the Flickr note provides a URI for the region, but one needs to access the Flickr API to access the region [CaMiCatzee]

84. Sarkozy Laughing with Putin?
   http://www.youtube.com/watch?v=7fMCTo-GQ2A#t=34s

85. Clinton Laughing with Yeltsin?
   http://www.youtube.com/watch?v=sxoh1z6s_Cw#t=15s
   - Temporal annotation in YouTube... but the UA downloads the complete resource and seeks... and the YouTube syntax is different from Google Video, Vimeo, DailyMotion, etc.

86. Problems
   - How to address and retrieve parts of multimedia content?
     - region of an image / sequence of a video
   - How to describe parts of multimedia content in the Web of Data?
   - How can one make better use of general world knowledge?

87. 1. Media Fragments
   - Every popular web site does it...
     - region-based annotation in Flickr
     - temporal sequence annotation in YouTube (#t=34s, #t=15s)
   - ... BUT:
     - region-based annotations cannot be exported
     - the YouTube syntax is different from DailyMotion, Vimeo, etc.

88. W3C Media Fragments WG
   http://www.w3.org/2008/WebVideo/Fragments/

89. W3C Media Fragments WG
   - Provide URI-based mechanisms for uniquely identifying fragments of media objects on the Web, such as video, audio and images.

90. W3C Media Fragments WG
   - Research question: how to access media fragments without retrieving the whole resource?
   - Use cases and requirements
     - Link to, display, bookmark, annotate, search
   - Four dimensions considered
     - time, space, track, id
   - Protocols covered: HTTP(S), RTSP
   http://www.w3.org/TR/media-frags-reqs
91. Media Fragment Processing (1)
   - User Agent (media-fragment conforming): http://www.youtube.com/watch?v=sxoh1z6s_Cw#t=12,21
     GET /watch?v=sxoh1z6s_Cw HTTP/1.1
     Host: www.youtube.com
     Accept: video/*
     Range: seconds=12-21 [HTTPbis clarification]
   - Server:

92. Media Fragment Processing (2)
   - User Agent (media-fragment conforming): http://www.youtube.com/watch?v=sxoh1z6s_Cw#t=12,21
   - Server:
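The user-agent side of this exchange, turning the URI fragment `#t=12,21` into the proposed `Range: seconds=12-21` header, can be sketched as follows. Only the plain seconds form of the temporal dimension is handled, and the function name is mine:

```python
# Sketch of a media-fragment-conforming UA: parse the temporal dimension
# of the fragment and map it to the seconds Range unit proposed to HTTPbis.
from urllib.parse import urlparse

def fragment_to_range(uri):
    """Return a Range header for a #t=start,end fragment, else None."""
    frag = urlparse(uri).fragment          # e.g. "t=12,21"
    if not frag.startswith("t="):
        return None
    start, _, end = frag[2:].partition(",")
    return "Range: seconds=%s-%s" % (start, end)

fragment_to_range("http://www.youtube.com/watch?v=sxoh1z6s_Cw#t=12,21")
```

Because the fragment stays client-side, the UA must translate it into a protocol-level header like this before the server can send only the requested interval.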
93. 2. Media Annotations
   - Annotate the content (interpretation): Boris Yeltsin, Bill Clinton, laugh, Bosnia, Hyde Park
   - Using structured knowledge on the Web:
     :Clip foaf:depicts dbpedia:Laughter
     :Clip foaf:depicts dbpedia:Boris_Yeltsin
     :Clip foaf:depicts dbpedia:Bill_Clinton
     :Clip foaf:depicts dbpedia:Hyde_Park,New_York
     dbpedia:Hyde_Park,New_York owl:sameAs fbase:hyde_park
     fbase:hyde_park skos:broader fbase:new_york_state
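A small sketch of why these links into structured Web knowledge pay off: the clip is only annotated with Hyde Park, yet a query for "clips located in New York State" can succeed by following `skos:broader`. The triples are the slide's examples plus one `broader` link (new_york_state to united_states) assumed for illustration:

```python
# Sketch of skos:broader traversal over the slide's Freebase-style terms.
# The new_york_state -> united_states link is assumed for the example.
broader = {
    "fbase:hyde_park": "fbase:new_york_state",
    "fbase:new_york_state": "fbase:united_states",
}

def within(place, region):
    """True if region is reachable from place via skos:broader."""
    while place is not None:
        if place == region:
            return True
        place = broader.get(place)
    return False
```

This is the "better use of general world knowledge" the slides argue for: the geographic hierarchy comes for free from the linked dataset, not from the annotator.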
94. Media Annotations
   - Using structured knowledge on the Web:
     :Reg wsmo:partOf flickr:mhausenblas/2883727293/
     :Reg foaf:depicts http://saphira.blogr.com/#me

95. 3. What For? [More details in Parts 2 & 3]
   - Answer abstract queries
     - Find all sequences with a Russian President laughing with another President
   - Find (non-obvious) connections between pieces of media
   - Present information

96. Answer abstract queries
   - Research problems
     - Data modeling, vocabulary alignment, disambiguation

   PREFIX foaf: <http://xmlns.com/foaf/0.1/>
   SELECT ?Clip
   WHERE {
     ?Clip foaf:depicts dbpedia:Laughter ,
                        yago:PresidentsOfTheRussianFederation ,
                        yago:President110468559 .
   }

97. Find connections between media
   - Unexpected relationships: enable further discovery and exploration
   - Research problems
     - Where should we stop in the exploration?
     - When does it start to be intrusive for the end user?

   :Clip foaf:depicts dbpedia:Boris_Yeltsin
   :Clip foaf:depicts dbpedia:Bill_Clinton
   :Clip foaf:depicts fbase:Laughter

98. Presenting Information
   - Dimensions (metadata) used for searching news items
     - When:  time          10/07/2006
     - Where: location      Paris
     - What:  is depicted   J. Chirac, Z. Zidane
     - Why:   event         WC 2006
     - Who:   photographer  Bertrand Guay, AFP

103. Literature
   - Michael Hausenblas, Raphaël Troncy, Yves Raimond and Tobias Bürger: Interlinking Multimedia: How to Apply Linked Data Principles to Multimedia Fragments. In 2nd Workshop on Linked Data on the Web (LDOW'09), Madrid, Spain, April 20, 2009.
   - Raphaël Troncy, Jack Jansen, Yves Lafon, Erik Mannens, Silvia Pfeiffer and Davy van Deursen: Use cases and requirements for Media Fragments. W3C Working Draft, Media Fragments Working Group, 30 April 2009.

104. Thanks for your attention
   http://www.cwi.nl/~troncy
