Automatic Image Annotation

  1. Automatic Image Annotation and Retrieval Using the Joint Composite Descriptor
     Konstantinos Zagoris, Savvas A. Chatzichristofis, Nikos Papamarkos and Yiannis S. Boutalis
     Department of Electrical & Computer Engineering, Democritus University of Thrace, Xanthi, Greece
     kzagoris@ee.duth.gr
  2. Problem Definition
     • Today, capable tools are needed in order to successfully search and retrieve images.
     • Content-based image retrieval techniques such as img(Anaktisi) and img(Rummager) employ low-level image features such as color, texture and shape in order to locate similar images.
     • Although these approaches are successful, they lack the ability to include human perception in the query.
  3. Proposed Technique
     • A new image annotation technique
     • A keyword-based image retrieval system
     • Employs the Joint Composite Descriptor
     • Utilizes two sets of keywords: one set consists of color keywords, the other consists of words
     • Queries can be specified more naturally by the user
  4. The Keyword Sets
  5. Keyword Annotation Image Retrieval System
     [Figure: the overall structure of the proposed system]
  6. Joint Composite Descriptor
     • Belongs to the family of Compact Composite Descriptors
     • Captures more than one feature at the same time, in a very compact representation
     • A global descriptor (global image low-level features)
     • A fusion of the Color and Edge Directivity Descriptor (CEDD) and the Fuzzy Color and Texture Histogram (FCTH)
  7. CEDD and FCTH Descriptors
     • The CEDD length is 54 bytes per image, while the FCTH length is 72 bytes per image.
     • The structure of these descriptors consists of n texture areas. In particular, each texture area is separated into 24 sub-regions, with each sub-region describing a color.
     • CEDD and FCTH use the same color information, as it results from 2 fuzzy systems that map the colors of the image to a 24-color custom palette.
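     A quick size check (a sketch, assuming the 3-bit-per-bin quantization used in the original CEDD and FCTH papers, and the texture-area counts implied by the fusion equations on slide 11: 6 areas for CEDD, 8 for FCTH):
        CEDD: 6 texture areas x 24 colors = 144 bins; 144 x 3 bits = 432 bits = 54 bytes
        FCTH: 8 texture areas x 24 colors = 192 bins; 192 x 3 bits = 576 bits = 72 bytes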
  8. CEDD and FCTH Descriptors
     [Figure: structure of the CEDD and FCTH descriptors]
  9. CEDD and FCTH Descriptors
     • The color information given by the two descriptors comes from the same fuzzy system.
     • CEDD uses a fuzzy version of the five digital filters proposed by the MPEG-7 Edge Histogram Descriptor.
     • FCTH uses the high-frequency bands of the Haar wavelet transform in a fuzzy system.
     • So, the joining of the descriptors is accomplished by the fusion (combination) of the texture areas carried by each descriptor.
  10. Joint Composite Descriptor (JCD)
      • Each descriptor bin can be written as $CEDD_j^n(m)$ or $FCTH_j^k(m)$, where $j$ denotes the image, $n$ and $k$ the texture area, and $m$ the color sub-region.
      • For example, $CEDD_j^2(5) = bin(2 \times 24 + 5) = bin(53)$.
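      A minimal sketch of this indexing scheme in Python (the helper name is illustrative, not code from the paper):
        # Sketch: address a (texture area, color) pair inside a flat CEDD or FCTH histogram.
        # Each texture area holds 24 color sub-regions, so bin = texture_area * 24 + color.
        def bin_index(texture_area: int, color: int) -> int:
            """Flat histogram index of a (texture area, color sub-region) pair."""
            assert 0 <= color < 24
            return texture_area * 24 + color

        assert bin_index(2, 5) == 53  # the CEDD example from this slide: bin(2 x 24 + 5)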
  11. Joint Composite Descriptor (JCD)
      $JCD_j^0(i) = \frac{FCTH_j^0(i) + FCTH_j^4(i) + CEDD_j^0(i)}{2}$
      $JCD_j^1(i) = \frac{FCTH_j^1(i) + FCTH_j^5(i) + CEDD_j^2(i)}{2}$
      $JCD_j^2(i) = CEDD_j^4(i)$
      $JCD_j^3(i) = \frac{FCTH_j^2(i) + FCTH_j^6(i) + CEDD_j^3(i)}{2}$
      $JCD_j^4(i) = CEDD_j^5(i)$
      $JCD_j^5(i) = FCTH_j^3(i) + FCTH_j^7(i)$
      $JCD_j^6(i) = CEDD_j^1(i)$
      with $i \in [0, 23]$
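      A minimal sketch of this fusion (assuming a 144-bin CEDD and a 192-bin FCTH histogram laid out as texture areas x 24 colors; the NumPy formulation and names are illustrative, and the division by two follows the fraction layout read from the slide):
        import numpy as np

        def join_histograms(cedd: np.ndarray, fcth: np.ndarray) -> np.ndarray:
            """Fuse a 144-bin CEDD and a 192-bin FCTH histogram into a 168-bin JCD."""
            c = cedd.reshape(6, 24)   # CEDD texture areas 0..5
            f = fcth.reshape(8, 24)   # FCTH texture areas 0..7
            jcd = np.empty((7, 24))
            jcd[0] = (f[0] + f[4] + c[0]) / 2
            jcd[1] = (f[1] + f[5] + c[2]) / 2
            jcd[2] = c[4]
            jcd[3] = (f[2] + f[6] + c[3]) / 2
            jcd[4] = c[5]
            jcd[5] = f[3] + f[7]
            jcd[6] = c[1]
            return jcd.reshape(-1)    # 7 texture areas x 24 colors = 168 bins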
  12. Color Similarity Grade (CSG)
      • Defines the amount of the corresponding color in the image
      • It is calculated from the JCD for each color keyword
  13. Color Similarity Grade (CSG)
      $black = \sum_{k=0}^{6} JCD_j^k(2)$
      $white = \sum_{k=0}^{6} JCD_j^k(0)$
      $gray = \sum_{k=0}^{6} JCD_j^k(1)$
      $yellow = \sum_{k=0}^{6} JCD_j^k(10)$
      $cyan = \sum_{k=0}^{6} JCD_j^k(16)$
      $green = \sum_{k=0}^{6} \left[ JCD_j^k(9) + JCD_j^k(12) + JCD_j^k(13) + JCD_j^k(14) \right]$
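      A minimal sketch of these sums over a 168-bin JCD laid out as 7 texture areas x 24 colors (only the six color keywords listed on this slide are included; names are illustrative):
        import numpy as np

        # Color keyword -> JCD color sub-region indices, taken from this slide.
        COLOR_BINS = {
            "black": [2], "white": [0], "gray": [1],
            "yellow": [10], "cyan": [16], "green": [9, 12, 13, 14],
        }

        def color_similarity_grades(jcd: np.ndarray) -> dict:
            """Sum each color's sub-regions over all 7 JCD texture areas."""
            areas = jcd.reshape(7, 24)
            return {color: float(areas[:, bins].sum()) for color, bins in COLOR_BINS.items()}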
  14. Portrait Similarity Grade (PSG)
      • Defines the connection of the image depiction with the corresponding word
      • It is calculated from the normalization of a trained Support Vector Machine (SVM) decision function
      • For each word, an SVM is trained using as training samples the JCD values from a small subset of the available image database
  15. Support Vector Machines (SVMs)
      • Based on statistical learning theory
      • Separate the space in which the training samples reside into two classes
      • A new sample is classified depending on where in that space it resides
  16. Portrait Similarity Grade (PSG)
      • In this work, the following equation is used to determine the membership value $R(x)$ of a sample $x$ to class 1:
      [Equation: a piecewise normalization of the SVM decision function $f(x)$, with one branch per sign of $f(x)$, built from exponential terms of $f(x)$ and $\max(f(x), 0)$ and scaled by a factor of 100; the exact expression is not recoverable from this transcript]
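      A minimal sketch of the per-word SVM step (scikit-learn is assumed; the logistic mapping below is an illustrative stand-in for the paper's piecewise normalization, which is not recoverable from this transcript):
        import numpy as np
        from sklearn.svm import SVC

        def train_word_svm(jcd_pos: np.ndarray, jcd_neg: np.ndarray) -> SVC:
            """Train one SVM per word on JCD vectors from a small labeled subset."""
            X = np.vstack([jcd_pos, jcd_neg])
            y = np.hstack([np.ones(len(jcd_pos)), np.zeros(len(jcd_neg))])
            return SVC(kernel="rbf").fit(X, y)

        def portrait_similarity_grade(svm: SVC, jcd: np.ndarray) -> float:
            """Map the SVM decision value f(x) to a 0..100 grade (illustrative sigmoid)."""
            f = float(svm.decision_function(jcd.reshape(1, -1))[0])
            return 100.0 / (1.0 + np.exp(-f))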
  17. Keyword Annotation Image Retrieval System
      • The user can employ a number of keywords from both sets in order to describe the image.
      • For each keyword $A$, an initial classification position $R_A$ is calculated based on the Manhattan distance and the keyword's PSG or CSG.
      • Then the final rank is calculated for each image $l$ as $Rank_A(l) = \frac{N - R_A}{N}$, where $N$ is the total number of images in the database.
      • For multiple keywords, the ranks for the individual keywords are summed: $Rank(l) = Rank_A(l) + Rank_B(l) + Rank_C(l) + \dots$
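      A minimal sketch of this rank fusion (assuming each keyword already supplies the classification position R_A(l) of every image, e.g. by sorting on its CSG or PSG; names are illustrative):
        def keyword_rank(positions: list[int], n_images: int) -> list[float]:
            """Rank_A(l) = (N - R_A(l)) / N for one keyword, given each image's position."""
            return [(n_images - r) / n_images for r in positions]

        def combined_rank(per_keyword_positions: list[list[int]], n_images: int) -> list[float]:
            """Sum the per-keyword ranks of every image to obtain the final score."""
            fused = [0.0] * n_images
            for positions in per_keyword_positions:
                for image_id, rank in enumerate(keyword_rank(positions, n_images)):
                    fused[image_id] += rank
            return fused  # higher score = better match for the query keywords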
  18. Implementation
      The proposed work is implemented as part of the img(Anaktisi) web application at http://www.anaktisi.net
  19. Evaluation
      • The keyword-based image retrieval is tested on the Wang and NISTER databases
  20. Image Retrieval Results
      [Figure: image retrieval results using the keyword "mountain"]
      [Figure: image retrieval results using the JCD of the first image as the query]
  21. Conclusions
      • An automatic image annotation method which maps low-level features to the high-level features that humans employ.
      • This method provides two distinct sets of keywords: colors and words.
      • The proposed system has been implemented and evaluated on the Wang and NISTER databases.
      • The results demonstrate the method's effectiveness, as it retrieves results more consistent with human perception.
  22. Thank you
