MultiModal Retrieval Image






    Presentation Transcript

    • MMRetrieval.net
      A Multimodal Search Engine
    • Multimodal Information
      Single-language, text-only retrieval has reached a limit.
      Content-based image retrieval is computationally costly and still in its infancy.
      Digital information is increasingly becoming multimodal.
    • Modality
      Dictionary: A tendency to conform to a general pattern or belong to a particular group or category.
      Definition of Modality in Information Retrieval
      It is unclear, fuzzy
      1st Definition: Modality = Media
      2nd Definition: Modality = Data Stream
    • MMRetrieval.net
      A Product of Cooperation
      Started June, 2010
      Avi Arampatzis, Lecturer, D.U.T.H.
      Konstantinos Zagoris, Ph.D., D.U.T.H.
      Savvas A. Chatzichristofis, Ph.D. candidate, D.U.T.H.
    • ImageCLEF 2010 Wikipedia Retrieval Task
      ImageCLEF 2010 Wikipedia Collection
      Consisting of 237,434 items
      Image Primary Media
      Noisy and Incomplete User Supplied Textual Annotations
      Wikipedia Articles Containing the Images
      Written in any combination of English, German, French, or any other unidentified language
    • Wikipedia Collection
      <name>Balloons Festival - Chateaux d'Oex.jpg</name>
      <caption article="text/en/4/331622">Balloon festival </caption>
      <comment>(Balloon festival in Chateaux d'Oex. Category:Chateau d'OexCategory:Hot air balloons) </comment>
    • ImageCLEF 2010 Wikipedia Retrieval Task
      70 test topics
      consisting of a textual and a visual part
      three title fields, one per language (English, German, French)
      one or more example images
    • Wikipedia Topic
      <title xml:lang="en">tennis player on court</title>
      <title xml:lang="de">tennisspieler auf dem platz</title>
      <title xml:lang="fr">joueur de tennis sur le terrain</title>
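A minimal sketch of how the three per-language titles of a topic could be read. The `<topic>` wrapper element and the function name `titles_by_language` are assumptions made for illustration; only the three `<title>` lines come from the slide.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined "xml" prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical <topic> wrapper; the slide shows only the <title> elements.
TOPIC_XML = """<topic>
  <title xml:lang="en">tennis player on court</title>
  <title xml:lang="de">tennisspieler auf dem platz</title>
  <title xml:lang="fr">joueur de tennis sur le terrain</title>
</topic>"""


def titles_by_language(topic_xml: str) -> dict:
    """Return a mapping from language code to the topic title in that language."""
    root = ET.fromstring(topic_xml)
    return {t.get(XML_LANG): (t.text or "").strip() for t in root.iter("title")}


titles = titles_by_language(TOPIC_XML)
```

Each title can then be sent as a separate query against the text index of the matching language.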
    • Extraction of Modalities
      Joint Composite Descriptor (JCD)
      Spatial Color Distribution (SpCD)
      Lemur Toolkit V4.11 and Indri V2.11 with the tf.idf retrieval model
    • MMRetrieval.net Structure
    • Fusion in Information Retrieval
      combining evidence about relevance from different sources of information
      from several modalities
      fusion consists of two components
      score normalization
      score combination
    • Score Normalization
      relevance scores from different modalities are not directly comparable
      scores of popular text retrieval models (tf.idf) can be turned into probabilities of relevance via the score-distributional method
      image descriptor scores do not fit this method
      MinMax (maps scores linearly to [0,1])
      Zscore (maps each score to the number of standard deviations it lies above or below the mean score)
      non-linear Known-Item Aggregate Cumulative Density Function (KIACDF)
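The two linear normalizations named above can be sketched as follows. This is an illustration under stated assumptions, not the MMRetrieval.net code: the function names are hypothetical, the population standard deviation is an assumed choice, and returning zeros for a constant score list is an assumed convention. KIACDF is not sketched because the slide does not define it.

```python
from statistics import mean, pstdev


def min_max(scores):
    """Map scores linearly onto [0, 1] (MinMax normalization)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Assumed convention: a constant list carries no ranking information.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def z_score(scores):
    """Map each score to its distance from the mean in standard deviations."""
    mu = mean(scores)
    sigma = pstdev(scores)  # population standard deviation (assumption)
    if sigma == 0:
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]
```

For example, `min_max([2.0, 4.0, 6.0])` gives `[0.0, 0.5, 1.0]`, so the result lists of different modalities end up on a common scale before they are combined.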
    • Score Combination
    • Results
    • Corrected Results
    • Fusion Results
    • Fusion Problems
      appropriate weighting of modalities and score normalization/combination are not trivial problems
      if results are assessed by visual similarity only, fusion is not a theoretically sound method
    • Content-based Image Retrieval Problems
      Content-based Image Retrieval (CBIR) with global features is notoriously noisy for image queries of low generality, where generality is the fraction of relevant images in a collection.
      CBIR does not scale up well to large databases efficiency-wise
    • Two-Stage Image Retrieval
      how it works: first use the secondary modality to rank the collection, then perform CBIR only on the top-K items
      assumption: primary (image) – secondary (text) modalities
      hypothesis: CBIR can do better than text retrieval in small sets or sets of high query generality
      efficiency benefit: using a ‘cheaper’ secondary modality also improves efficiency by cutting down on costly CBIR operations
      possible drawback: relevant images with empty or very noisy secondary modalities would be completely missed
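The two-stage idea can be sketched as below. Several details are assumptions not stated on the slide: the function name `two_stage_rank` is hypothetical, both score dictionaries are taken to be "higher is better" similarities, and items below the top K are kept in their text order rather than dropped.

```python
def two_stage_rank(text_scores, image_scores, k):
    """Rank by text first, then re-rank only the top-k items by image score.

    text_scores, image_scores: dict mapping item id to a score
    (higher is better; an assumption, since many CBIR descriptors
    natively produce distances).
    """
    # Stage 1: the cheaper secondary modality (text) ranks the collection.
    by_text = sorted(text_scores, key=text_scores.get, reverse=True)
    head, tail = by_text[:k], by_text[k:]
    # Stage 2: costly CBIR scores are needed only for the top-k items.
    reranked = sorted(
        head,
        key=lambda item: image_scores.get(item, float("-inf")),
        reverse=True,
    )
    return reranked + tail


text = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.1}
image = {"a": 0.2, "b": 0.5, "c": 0.9, "d": 1.0}
ranking = two_stage_rank(text, image, 3)
```

In this toy input, item `d` has the best image score but a poor text score, so it never enters the top 3 and stays last, which illustrates the drawback listed above.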
    • Previous Work
      Re-ranking the top results by visual content has been seen before
      mostly in different setups
      All these approaches employed a static, predefined K for all queries
      not clear if it works
    • Our Two-Stage Method
      dynamic K
      calculated dynamically per query
      optimize a predefined effectiveness measure
      without using external information or training data
    • Retrieval Results
      Image Only
      cockpit of an airplane
      Text Only
      Static K=25
      Dynamic K
    • Best Fusion Method – Max of Sums
      i is the index running over the example images (i = 1, 2, …)
      j runs over the visual descriptors (j ∈ {1, 2})
      DESC_ji is the score against the i-th example image for the j-th descriptor
      parameter w controls the relative contribution of the two media
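The formula itself did not survive in the transcript, so the following is only one plausible reading of "max of sums" built from the symbol definitions above. It assumes that the descriptor scores are summed per example image, that the maximum over example images is taken, and that the result is mixed linearly with a single text score using w. Which medium w multiplies, the use of one text score, and the function name are all assumptions.

```python
def max_of_sums(desc_scores, text_score, w):
    """One possible reading of the fusion rule (an assumption, not the slide's formula).

    desc_scores[i][j]: normalized score of an item against example image i
    for visual descriptor j.
    text_score: normalized text retrieval score of the same item.
    w: weight in [0, 1] given to the visual side (assumed direction).
    """
    # Sum over descriptors j for each example image i, then keep the best image.
    visual = max(sum(per_descriptor) for per_descriptor in desc_scores)
    return w * visual + (1.0 - w) * text_score


# Two example images (i), two descriptors each (j).
fused = max_of_sums([[0.2, 0.3], [0.6, 0.1]], text_score=0.4, w=0.5)
```

Here the per-image sums are 0.5 and 0.7, so the visual part is 0.7 and the fused score is about 0.55.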
    • Fusion vs Two-Stage
    • Implementation
      • developed in the C#/.NET Framework 4.0
      • HTML, CSS and JavaScript (AJAX) technologies for the interface
      • requires a fairly modern browser
    • Directions for Further Research
      Multi-stage retrieval for multimodal databases based on modality hierarchy.
      Fuzzy Fusion (replace w with membership function m).
      Create artificial modalities (not only from relevance scores)
      pseudo relevance feedback – cross media feedback
    • Publications
      Multimedia Search with Noisy Modalities: Fusion and Multistage Retrieval. Avi Arampatzis, Savvas A. Chatzichristofis, and Konstantinos Zagoris. In: CLEF (Notebook Papers/LABs/Workshops), 22-23 September, Padua, Italy, 2010.
      www.MMRetrieval.net: A Multimodal Search Engine. Konstantinos Zagoris, Avi Arampatzis, and Savvas A. Chatzichristofis. In: Proceedings of the 3rd International Conference on SImilarity Search and APplications, SISAP 2010, Istanbul, Turkey, September 18-19, 2010. © Association for Computing Machinery (ACM).
    • Thank you!