RUCoD is a formal representation of a content object, consisting of descriptions of the media types associated with it. The acronym stands for Rich Unified Content Description. Released in its final version in the I-SEARCH project, it will be used and extended in CUbRIK.

  1. Rich Unified Content Description (RUCoD). Vincenzo Croce, Research & Development Laboratory. STAR 2013, Lausanne, 18-19 January 2013
  2. Objectives • To develop a formal description for each type of multimedia content (text, audio, image, video and 3D content) • To develop formal descriptions for real-world and social information • To clearly specify the format of the Rich Unified Content Description (RUCoD) • To represent, in the same format, the actual content (multimedia information) along with the additional contextual information (real-world, user-related)
  3. The Concept of Content Object: “A Content Object is the representation of a specific instance of either a physical object or a physical entity (an entity that has physical existence), which might have multiple views (many images, videos, audio files, text, real-world and user-related information).” Similar approaches: • Multimedia Document (MMD): a set of co-occurring multimedia objects (e.g. images, audio and text) that are of different modalities but carry the same semantics. If two multimedia objects are in the same MMD, they can be regarded as context of each other. • Multimedia Bag: defines a container including text instances, image instances and audio instances that share the same semantic concepts
  4. The Concept of Content Object • A CO may consist of several media types, user-related information and real-world information. • A CO can be the result of an authoring process (e.g. using an authoring tool). • RUCoD is a formal representation of a CO consisting of descriptions of various media types that are associated with each other.
  5. Example Content Object. CO: Great Pyramid of Giza (Cheops) • 3D object • Image • Real-world (location) • Text: It is believed the pyramid was built as a tomb for Fourth dynasty Egyptian pharaoh Khufu (or Cheops) and constructed over a 14 to 20 year period concluding around 2560 BC…
  6. RUCoD Specification • Header: CO ID, Creator, Version, CO Types (Multimedia, RW-info, U-Info, …) • Low-Level Descriptors: text-based descriptors, image descriptors, video descriptors, audio descriptors, 3D descriptors, … • Real World Descriptors: Position (GPS, etc), Weather (temperature), Date, Time, Sensors, … • User-related Descriptors: Expressions, Emotions, Valence, Arousal, …
  7. RUCoD Structure. Description: • Header • L_Descriptor: “TextType”, “Object3D”, “ImageType”, “SoundType”, “VideoType” • R_Descriptor: “ContextType” • U_Descriptor: “Valence”, “Arousal”
  8. RUCoD Structure (Header): <Header> <ContentObjectType>Physical Object</ContentObjectType> <ContentObjectName xml:lang="en-US">My Bulldog Barking</ContentObjectName> <ContentObjectID>3577B5EF-523F-4946-9734-C974CEA6C646</ContentObjectID> <ContentObjectVersion>1</ContentObjectVersion> <ContentObjectCreationInformation> <Creator> <Name>CERTH</Name></Creator> </ContentObjectCreationInformation> <ContentObjectTypes> <MultimediaContent type="Text"> <FreeText>It is the image, video and 3D representation… </FreeText> </MultimediaContent> <MultimediaContent type="Object3D"> <MediaName>Bulldog</MediaName> <FileFormat>x-world/x-vrml</FileFormat> <MediaLocator> <MediaUri></MediaUri> <MediaPreview></MediaPreview> </MediaLocator> </MultimediaContent> <MultimediaContent type="ImageType"> … </Header>
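A minimal sketch of how such a header could be read with Python's standard `xml.etree.ElementTree`. The XML below is a trimmed, well-formed copy of the slide's sample (the elided `ImageType` entry and the empty locators are left out), and the `summarize_header` helper is a hypothetical name introduced only for illustration; it is not part of any RUCoD tooling.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the header sample from the slide (elided parts omitted).
HEADER_XML = """<Header>
  <ContentObjectType>Physical Object</ContentObjectType>
  <ContentObjectName xml:lang="en-US">My Bulldog Barking</ContentObjectName>
  <ContentObjectID>3577B5EF-523F-4946-9734-C974CEA6C646</ContentObjectID>
  <ContentObjectVersion>1</ContentObjectVersion>
  <ContentObjectCreationInformation>
    <Creator><Name>CERTH</Name></Creator>
  </ContentObjectCreationInformation>
  <ContentObjectTypes>
    <MultimediaContent type="Text">
      <FreeText>It is the image, video and 3D representation… </FreeText>
    </MultimediaContent>
    <MultimediaContent type="Object3D">
      <MediaName>Bulldog</MediaName>
      <FileFormat>x-world/x-vrml</FileFormat>
    </MultimediaContent>
  </ContentObjectTypes>
</Header>"""


def summarize_header(xml_text):
    """Hypothetical helper: collect the identifying fields of a RUCoD header."""
    root = ET.fromstring(xml_text)
    return {
        "type": root.findtext("ContentObjectType"),
        "name": root.findtext("ContentObjectName"),
        "id": root.findtext("ContentObjectID"),
        "version": root.findtext("ContentObjectVersion"),
        "creator": root.findtext("ContentObjectCreationInformation/Creator/Name"),
        # One entry per media item declared in the content object.
        "media_types": [
            item.get("type")
            for item in root.iterfind("ContentObjectTypes/MultimediaContent")
        ],
    }


summary = summarize_header(HEADER_XML)
print(summary["name"], summary["media_types"])
```

The header alone is enough to tell which modalities a content object carries, without touching the descriptor sections.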
  9. RUCoD Structure (L-Descriptor): <L_Descriptor type="Object3D"> <MediaName>BulldogLR</MediaName> <Shape3DDescription type="CMVD" matching="MultiViewL2"> <LowLevelDescriptor totalNumOfViews="18" totalNumOfDescriptors="212" descriptorType="xsd:float" descriptorSize="3816"> <Store type="Text"> <DescriptorLocator> <DescriptorUri></DescriptorUri> </DescriptorLocator> </Store> </LowLevelDescriptor> </Shape3DDescription> </L_Descriptor> <L_Descriptor type="SoundType"> <MediaName>BulldogSound2</MediaName> <AudioDescription type="BarkBands" matching="BrayCurtis"> <LowLevelDescriptor totalNumOfDescriptors="216" descriptorType="xsd:float" descriptorSize="8 27"> <DescriptorValues> 2.31510340412e-12 3.04525744899e-11 2.56972665369e-10 6.54444409776e-09 2.40772557447e-09 2.14078905714e-08 … </DescriptorValues> </LowLevelDescriptor> </AudioDescription> </L_Descriptor>
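The `matching` attribute names the scheme used to compare two descriptors. The sketch below assumes that `BrayCurtis` refers to the standard Bray-Curtis dissimilarity, sum(|u - v|) / sum(|u + v|); the slides do not define it. The XML is a trimmed copy of the sound descriptor holding only the six values visible on the slide, and `read_values` and `bray_curtis` are illustrative names.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the SoundType descriptor; only the six values shown on the
# slide are included, the elided remainder is not reconstructed.
L_DESCRIPTOR_XML = """<L_Descriptor type="SoundType">
  <MediaName>BulldogSound2</MediaName>
  <AudioDescription type="BarkBands" matching="BrayCurtis">
    <LowLevelDescriptor descriptorType="xsd:float">
      <DescriptorValues>
        2.31510340412e-12 3.04525744899e-11 2.56972665369e-10
        6.54444409776e-09 2.40772557447e-09 2.14078905714e-08
      </DescriptorValues>
    </LowLevelDescriptor>
  </AudioDescription>
</L_Descriptor>"""


def read_values(xml_text):
    """Return the declared matching scheme and the descriptor values as floats."""
    root = ET.fromstring(xml_text)
    description = root.find("AudioDescription")
    raw = description.findtext("LowLevelDescriptor/DescriptorValues")
    return description.get("matching"), [float(token) for token in raw.split()]


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity (assumed meaning of matching="BrayCurtis")."""
    if len(u) != len(v):
        raise ValueError("descriptors must have the same length")
    denominator = sum(abs(a + b) for a, b in zip(u, v))
    if denominator == 0:
        return 0.0  # two all-zero vectors are treated as identical
    return sum(abs(a - b) for a, b in zip(u, v)) / denominator


matching, values = read_values(L_DESCRIPTOR_XML)
print(matching, len(values), bray_curtis(values, values))
```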
  10. RUCoD Structure (R-Descriptor): <R_Descriptor> <RealWorldDescription type="ContextType"> <ContextSlice> <Importance>1</Importance> <DateTime> <Date>1997-07-16T19:20:30.45+01:00</Date> <Length>100</Length> </DateTime> <SubjectPosition> <gml:CircleByCenterPoint numArc="1"> <gml:pos>45.8419444 13.4002778</gml:pos> <gml:radius uom="M">10</gml:radius> </gml:CircleByCenterPoint> </SubjectPosition> <Weather> <Condition>OVC RA</Condition> <Temperature>20</Temperature> <WindSpeed>2</WindSpeed> <Humidity>94</Humidity> </Weather> </ContextSlice> </RealWorldDescription> </R_Descriptor>
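As an illustration of how the real-world part could support location queries, the sketch below reads the GML circle and tests whether a query point falls inside it. Several things are assumptions rather than statements from the slides: the `gml` prefix is bound to the namespace `http://www.opengis.net/gml` (the slide omits the declaration), `gml:pos` is read as "latitude longitude", `uom="M"` is read as metres, and distance is computed with the haversine formula on a spherical Earth. The function names are hypothetical.

```python
import math
import xml.etree.ElementTree as ET

# Assumed namespace binding; the slide's sample does not declare the prefix.
GML_NS = {"gml": "http://www.opengis.net/gml"}

# Trimmed copy of the R-Descriptor sample, keeping only the position.
R_DESCRIPTOR_XML = """<R_Descriptor xmlns:gml="http://www.opengis.net/gml">
  <RealWorldDescription type="ContextType">
    <ContextSlice>
      <SubjectPosition>
        <gml:CircleByCenterPoint numArc="1">
          <gml:pos>45.8419444 13.4002778</gml:pos>
          <gml:radius uom="M">10</gml:radius>
        </gml:CircleByCenterPoint>
      </SubjectPosition>
    </ContextSlice>
  </RealWorldDescription>
</R_Descriptor>"""


def read_circle(xml_text):
    """Return (latitude, longitude, radius) of the subject position."""
    root = ET.fromstring(xml_text)
    circle = root.find(".//gml:CircleByCenterPoint", GML_NS)
    lat, lon = (float(t) for t in circle.findtext("gml:pos", namespaces=GML_NS).split())
    radius = float(circle.findtext("gml:radius", namespaces=GML_NS))
    return lat, lon, radius


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres on a sphere of radius 6,371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * 6371000.0 * math.asin(math.sqrt(a))


def within_subject_area(xml_text, lat, lon):
    """True if the query point lies inside the circle stored in the descriptor."""
    center_lat, center_lon, radius = read_circle(xml_text)
    return haversine_m(center_lat, center_lon, lat, lon) <= radius


print(within_subject_area(R_DESCRIPTOR_XML, 45.8419444, 13.4002778))
```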
  11. RUCoD Structure (U-Descriptor): <U_Descriptor type="UserType"> <MediaName>aerosmith-Aerosmith-01-Make_It.mp3</MediaName> <UserDescription matching="L2Distance" type="AvgValenceArousal"> <LowLevelDescriptor descriptorSize="1 1" descriptorType="xsd:float" totalNumOfDescriptors="2"> <DescriptorValues>-0.4898 0.42857</DescriptorValues> </LowLevelDescriptor> </UserDescription> </U_Descriptor>
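A short sketch of reading this user-related descriptor and comparing it with a query under the declared `L2Distance` scheme, taken here to mean plain Euclidean distance. The value order (valence first, arousal second) is an assumption inferred from the type name `AvgValenceArousal`, and the query point is invented for the example.

```python
import math
import xml.etree.ElementTree as ET

# Copy of the U-Descriptor sample from the slide.
U_DESCRIPTOR_XML = """<U_Descriptor type="UserType">
  <MediaName>aerosmith-Aerosmith-01-Make_It.mp3</MediaName>
  <UserDescription matching="L2Distance" type="AvgValenceArousal">
    <LowLevelDescriptor descriptorSize="1 1" descriptorType="xsd:float" totalNumOfDescriptors="2">
      <DescriptorValues>-0.4898 0.42857</DescriptorValues>
    </LowLevelDescriptor>
  </UserDescription>
</U_Descriptor>"""


def read_valence_arousal(xml_text):
    """Return (valence, arousal); the order is assumed from the type name."""
    root = ET.fromstring(xml_text)
    raw = root.findtext("UserDescription/LowLevelDescriptor/DescriptorValues")
    valence, arousal = (float(token) for token in raw.split())
    return valence, arousal


def l2_distance(u, v):
    """Euclidean distance between two equally long vectors."""
    if len(u) != len(v):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


stored = read_valence_arousal(U_DESCRIPTOR_XML)
query = (-0.5, 0.4)  # hypothetical emotional state used as a query
print(stored, l2_distance(stored, query))
```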
  12. RUCoD Schema • RUCoD Schema Final Version (1.4.1) released in November 2011: • The RUCoD.xsd Schema file • The RUCoD_Descriptors.xsd Schema file
  13. Block diagram of the I-SEARCH framework
  14. Authoring & Content Analytics Architecture (diagram: Authoring, Content Analytics, indexing triggering)
  15. RUCoD in I-SEARCH Use Cases (diagram with the Rich Unified Content Description at the centre) • UC1: Music retrieval • UC2: Furniture retrieval • UC3: Social retrieval of music • UC4: Search for multimedia using smartphone • UC5: Search for specific product • UC6: 3D game component retrieval • UC7: Game avatar retrieval • Modalities across the use cases: text, audio (music, sounds), images, video, 3D objects, real-world (location, time), emotions
  16. Comparison with MPEG-7. What we use: • MediaLocator and MediaUri are used to describe the link to a specific media item. • Creator is used for description of the author of a media item. • Annotation as a part of RUCoD represents textual information of a media item or CO. • Image/Video/Audio Descriptors are used for the low-level descriptions of the separate media items within a CO. • Segment is used to describe a temporal video segment.
  17. Comparison with MPEG-7. What we adapted: • ContentObjectName and ContentObjectCreationInformation instead of name and CreationInformation, to represent the name and creators of COs. • TextDescription, Shape3DDescription, ImageDescription and VideoDescription, similar to MPEG-7 ContentDescription, to distinguish between the descriptors of different modalities inside the same RUCoD.
  18. Comparison with MPEG-7. What is new: • Cross-modal & multimodal retrieval are not entirely supported by the standards. • New types of information describing the COs are introduced, such as real-world descriptors and user-related descriptors. These enrich the CO description and improve the retrieval performance by introducing new querying capabilities. • With respect to low-level descriptor extraction for media items, novel descriptors are introduced. As an example, for 3D content description, new state-of-the-art descriptors are introduced, which achieve higher retrieval performance than those included in MPEG-7. Similarly, new descriptors are introduced for image, video and audio content. • The low-level description of media items is also accompanied by specification of the matching scheme for each descriptor. The description scheme therefore does not leave the choice of the appropriate matching method to the search engine.
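Because each descriptor names its own matching scheme, a search engine can select the comparison function from the description itself. The following is a hedged sketch of that idea, not the I-SEARCH implementation: the registry `MATCHERS` and the function `compare` are hypothetical, and only the two scheme names from the samples whose meaning can reasonably be assumed (`L2Distance` as Euclidean distance, `BrayCurtis` as Bray-Curtis dissimilarity) are registered.

```python
import math


def l2_distance(u, v):
    """Euclidean distance (assumed meaning of matching="L2Distance")."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity (assumed meaning of matching="BrayCurtis")."""
    denominator = sum(abs(a + b) for a, b in zip(u, v))
    if denominator == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / denominator


# Hypothetical registry keyed by the value of the `matching` attribute.
MATCHERS = {
    "L2Distance": l2_distance,
    "BrayCurtis": bray_curtis,
}


def compare(matching, u, v):
    """Compare two descriptors with the scheme named in the description."""
    if len(u) != len(v):
        raise ValueError("descriptors must have the same length")
    try:
        matcher = MATCHERS[matching]
    except KeyError:
        raise ValueError(f"no matcher registered for {matching!r}") from None
    return matcher(u, v)


print(compare("L2Distance", [0.0, 0.0], [3.0, 4.0]))
print(compare("BrayCurtis", [1.0, 0.0], [0.0, 1.0]))
```

Schemes that the slides name without defining, such as `MultiViewL2`, are deliberately left out of the registry and raise an error rather than being guessed.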
  19. Comparison with JPSearch. What is similar/different: • JPSearch is designed in a way that decouples the components of image search and provides a standard interface between these components. Its aim is to build a standard for interoperability among image search and retrieval systems. • The RUCoD specification is focused on the description of COs and addresses a broad range of media (apart from images), real-world and user-related information.
  20. Comparison with MPEG-21. What we use: • A multimodal approach to media (which can be of any type) • Allowing the creation and attachment of rich metadata to digital objects
  21. Comparison with MPEG-21. What is similar: • Ability to create multimedia content objects (Digital Items in MPEG-21) • Ability for content adaptation, although achieved differently (e.g. through FileFormat elements in the RUCoD) • L-Descriptors and R-Descriptors of RUCoD could be attached to MPEG-21 objects, although this is not directly foreseen by the standard
  22. Comparison with MPEG-21. What is different/new: • The CO broadens the concept of Digital Item, making it more general and flexible • Unification of the actual metadata and descriptors (e.g. L-Descriptors) together with real-world and user-related parts in the same format • RUCoD is particularly targeted at indexing, sharing, search and retrieval • RUCoD overcomes the traditional hierarchical object model (also foreseen in MPEG-21), allowing for more flexible and user-centric connections (e.g. RelatedSemanticConcepts field)
  23. Ongoing Work • RUCoD was initially designed to serve the needs of the I-SEARCH framework. • However, it is not mature enough to be used in a wider range of applications. • RUCoD will be extended within the EU-funded project CUbRIK (CERTH and ENG are participants). • RUCoD will be adopted in CUbRIK. • A first attempt: SMILA Hackathon, November 2011, Kaiserslautern. • RUCoD was presented in the workshop. • RUCoD was used as descriptor scheme to check indexing and search within the SMILA engine.
  24. Questions?