MultiModal Retrieval Image






MultiModal Retrieval Image Presentation Transcript

  • 1. MMRetrieval.net
    A Multimodal Search Engine
  • 2. Multimodal Information
    Single-language, text-only retrieval has reached a limit.
    Content-based Image Retrieval is computationally costly and still in its infancy.
    Digital information is increasingly becoming multimodal.
  • 3. Modality
    Dictionary: A tendency to conform to a general pattern or belong to a particular group or category.
    Definition of Modality in Information Retrieval
    It is unclear, fuzzy
    1st Definition: Modality = Media
    2nd Definition: Modality = Data Stream
  • 4. MMRetrieval.net
    A Product of Cooperation
    Started June, 2010
    Avi Arampatzis, Lecturer, D.U.T.H.
    Konstantinos Zagoris, Ph.D., D.U.T.H.
    Savvas A. Chatzichristofis, Ph.D. candidate, D.U.T.H.
  • 5. ImageCLEF 2010 Wikipedia Retrieval Task
    ImageCLEF 2010 Wikipedia Collection
    Consisting of 237,434 items
    Images are the primary medium
    Noisy and Incomplete User Supplied Textual Annotations
    Wikipedia Articles Containing the Images
    Written in any combination of English, German, French, or any other unidentified language
  • 6. Wikipedia Collection
    <name>Balloons Festival - Chateaux d'Oex.jpg</name>
    <caption article="text/en/4/331622">Balloon festival </caption>
    <comment>(Balloon festival in Chateaux d'Oex. Category:Chateau d'Oex Category:Hot air balloons) </comment>
  • 7. ImageCLEF 2010 Wikipedia Retrieval Task
    70 test topics
    consisting of a textual and a visual part
    three title fields, one per language (English, German, French)
    one or more example images
  • 8. Wikipedia Topic
    <title xml:lang="en">tennis player on court</title>
    <title xml:lang="de">tennisspieler auf dem platz</title>
    <title xml:lang="fr">joueur de tennis sur le terrain</title>
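The per-language title fields above can be read with a standard XML parser. The following is a minimal sketch in Python, assuming the three `<title>` elements sit inside an enclosing `<topic>` element; that wrapper and the helper name `titles_by_language` are hypothetical, since the slide shows only the title lines.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper: the slide shows only the three <title> lines,
# so the enclosing <topic> element is an assumption.
TOPIC_XML = """<topic>
  <title xml:lang="en">tennis player on court</title>
  <title xml:lang="de">tennisspieler auf dem platz</title>
  <title xml:lang="fr">joueur de tennis sur le terrain</title>
</topic>"""

# ElementTree reports the predefined xml: prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def titles_by_language(topic_xml: str) -> dict:
    """Map each language code to the query title written in that language."""
    root = ET.fromstring(topic_xml)
    return {t.get(XML_LANG): (t.text or "").strip() for t in root.iter("title")}


titles = titles_by_language(TOPIC_XML)
print(titles["en"])
```

Keying the titles by language code makes it easy to send each title to a language-specific text index, if one is used.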
  • 9. Extraction of Modalities
    Joint Composite Descriptor (JCD)
    Spatial Color Distribution (SpCD)
    Lemur Toolkit V4.11 and Indri V2.11 with the tf.idf retrieval model
  • 10. MMRetrieval.net Structure
  • 11. Fusion in Information Retrieval
    combining evidence about relevance from different sources of information
    from several modalities
    fusion consists of two components
    score normalization
    score combination
  • 12. Score Normalization
    relevance scores from different modalities are not directly comparable
    scores from popular text retrieval models (tf.idf) can be turned into probabilities of relevance via the score-distributional method
    image descriptor scores do not fit that method
    MinMax (maps scores linearly to [0, 1])
    Zscore (maps a score to the number of standard deviations it lies above or below the mean score)
    non-linear Known-Item Aggregate Cumulative Density Function (KIACDF)
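The two linear normalizations named above can be sketched as follows. This is a minimal Python illustration of the textbook MinMax and Z-score definitions, not the engine's own code; the function names and the choice of the population standard deviation are assumptions, and KIACDF is not covered because the slide does not define it.

```python
from statistics import mean, pstdev


def min_max(scores):
    """Map scores linearly onto [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def z_score(scores):
    """Express each score as standard deviations above or below the mean."""
    mu = mean(scores)
    sigma = pstdev(scores)  # population standard deviation (an assumption)
    if sigma == 0:
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


raw = [2.0, 4.0, 6.0, 8.0]
print(min_max(raw))
print(z_score(raw))
```

MinMax bounds the output but is sensitive to a single outlier score, while Z-score is unbounded but centers every modality on zero, which is one reason the choice of normalization matters for fusion.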
  • 13. Score Combination
  • 14. Results
  • 15. Corrected Results
  • 16. Fusion Results
  • 17. Fusion Problems
    appropriate weighting of modalities and score normalization/combination are not trivial problems
    if results are assessed by visual similarity only, fusion is not a theoretically sound method
  • 18. Content-based Image Retrieval Problems
    Content-based Image Retrieval (CBIR) with global features is notoriously noisy for image queries of low generality, where generality is the fraction of relevant images in a collection.
    CBIR does not scale well to large databases in terms of efficiency
  • 19. Two-Stage Image Retrieval
    how it works: first use the secondary modality to rank the collection, then perform CBIR only on the top-K items
    assumption: primary (image) – secondary (text) modalities
    hypothesis: CBIR can do better than text retrieval in small sets or sets of high query generality
    efficiency benefit: using a 'cheaper' secondary modality also improves efficiency by cutting down on costly CBIR operations
    possible drawback: relevant images with empty or very noisy secondary modalities would be completely missed
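The two-stage idea described above can be sketched in a few lines of Python. This is an illustrative sketch with a fixed K, not the authors' implementation: the function name `two_stage_rank`, the toy scores, and the decision to append the items below rank K in their text order (rather than drop them) are all assumptions.

```python
from typing import Callable, Dict, List


def two_stage_rank(
    text_scores: Dict[str, float],
    image_score: Callable[[str], float],
    k: int,
) -> List[str]:
    """Rank by text, then re-rank only the top-k items by visual score."""
    # Stage 1: rank the whole collection with the cheap secondary modality.
    by_text = sorted(text_scores, key=lambda d: text_scores[d], reverse=True)
    head, tail = by_text[:k], by_text[k:]
    # Stage 2: run the costly visual comparison on the top-k items only.
    visual = {d: image_score(d) for d in head}
    head = sorted(head, key=lambda d: visual[d], reverse=True)
    # Assumption: items below rank k keep their text order.
    return head + tail


# Toy data (hypothetical): text scores and a lookup standing in for CBIR.
text_scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.1}
visual_lookup = {"a": 0.2, "b": 0.9, "c": 0.5, "d": 1.0}
calls = []


def image_score(item_id: str) -> float:
    calls.append(item_id)  # record which items reach the costly stage
    return visual_lookup[item_id]


ranking = two_stage_rank(text_scores, image_score, k=3)
print(ranking)
print(calls)
```

In this toy run item "d" has the best visual score but a poor text score, so it never reaches the visual stage, which illustrates the drawback noted on the slide.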
  • 20. Previous Work
    Re-ranking the best results by visual content has been seen before
    mostly in different setups
    All these approaches employed a static predefined K for all queries
    not clear if it works
  • 21. Our Two-Stage Method
    dynamic K
    calculated dynamically per query
    optimize a predefined effectiveness measure
    without using external information or training data
  • 22. Retrieval Results
    Query: cockpit of an airplane
    Image Only
    Text Only
    Static K=25
    Dynamic K
  • 23. Best Fusion Method – Max of Sums
    i is the index running over the example images (i = 1, 2, …)
    j is the index running over the visual descriptors (j ∈ {1, 2})
    DESCji is the score against the i-th example image for the j-th descriptor
    parameter w controls the relative contribution of the two media
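The formula itself did not survive in this transcript, so the sketch below shows only one plausible reading of "max of sums", built from the symbols defined above: for each example image i, sum the descriptor scores DESCji over j, take the maximum over i, and mix the result with the text score using w. Assigning w to the visual side and 1 − w to the text side, the assumption that all scores are already normalized, and the name `max_of_sums` are guesses rather than the published definition.

```python
def max_of_sums(desc, text_score, w):
    """Fuse visual and text evidence for one collection item.

    desc[i][j] is the normalized score against the i-th example image
    for the j-th visual descriptor; text_score is the normalized text
    score; w in [0, 1] weights the visual side (an assumption).
    """
    # Sum over descriptors j for each example image i, then take the max over i.
    visual = max(sum(per_descriptor) for per_descriptor in desc)
    return w * visual + (1.0 - w) * text_score


# Toy scores (hypothetical): two example images, two descriptors each.
desc = [[0.25, 0.5], [0.5, 0.75]]
fused = max_of_sums(desc, text_score=0.5, w=0.5)
print(fused)
```

Taking the maximum over example images means an item only needs to resemble one of the query images well, rather than all of them on average.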
  • 24. Fusion vs Two-Stage
  • 25. Implementation
    developed in C# on the .NET Framework 4.0
    HTML, CSS and JavaScript (AJAX) technologies for the interface
    requires a fairly modern browser
  • Directions for Further Research
    Multi-stage retrieval for multimodal databases based on modality hierarchy.
    Fuzzy Fusion (replace w with membership function m).
    Create artificial modalities (not only from relevance scores)
    pseudo relevance feedback – cross media feedback
  • 28. Publications
    Multimedia Search with Noisy Modalities: Fusion and Multistage Retrieval. Avi Arampatzis, Savvas A. Chatzichristofis, and Konstantinos Zagoris. In: CLEF (Notebook Papers/LABs/Workshops), 22-23 September, Padua, Italy, 2010.
    www.MMRetrieval.net: A Multimodal Search Engine. Konstantinos Zagoris, Avi Arampatzis, and Savvas A. Chatzichristofis. In: Proceedings of the 3rd International Conference on SImilarity Search and APplications, SISAP 2010, Istanbul, Turkey, September 18-19, 2010. © Association for Computing Machinery (ACM).
  • 29. Thank you!