OntoGen Extension for Exploring Image Collections

The paper was presented at the 2011 IEEE 7th International Conference on Intelligent Computer Communication and Processing (ICCP 2011) on August 26th, 2011 in Cluj-Napoca, Romania, and appears in its proceedings.

Publication: http://bit.ly/GDIsRY

Abstract:
OntoGen is a semi-automatic, data-driven ontology editor that focuses on editing topic ontologies. It uses text mining tools to make ontology-related tasks simpler for the user. This restriction to building ontologies from textual data is the gap we are trying to bridge. We have successfully extended OntoGen to work with image data, allowing ontology construction and editing on collections of labeled or unlabeled images. Browsing large heterogeneous image collections efficiently is a challenging task, and we feel that semi-automatic ontology construction, as described in this paper, makes this task easier.

Published in: Technology, Education


  1. VISUALIZING IMAGE COLLECTIONS WITH ONTOGEN: From Images To Ontologies
  2. IMAGE DATA: Difficult to handle. High-dimensional representations. The amount of image data is constantly increasing, and there is a rising need for reliable automatic image analysis systems in practical applications.
  3. [Diagram labels: Image representation; Application; Data Mining; Extract features; Text; Color info; SIFT features]
  4. SIFT FEATURES: Rotation-, scale- and translation-invariant orientation gradients located at “interesting” points in an image. Usually, the SIFT feature space is quantized so that some “representative” vectors are found. Each feature in an observed image is then assigned to its nearest representative, which yields the so-called “codebook” histogram.
  5. COLOR HISTOGRAMS: Color information in an image might or might not be of interest for a particular problem, but it is usually a useful piece of information. There are several ways to handle this information; the simplest and fastest is to divide the color spectrum into “buckets” and calculate the distribution of colors over these buckets, thereby obtaining the color histogram of an image.
  6. ONTOGEN: OntoGen is a tool for semi-automatic ontology construction, clustering and classification, as well as data visualization via multidimensional scaling. It can easily be applied to image data to gain an overview of collections of images.
  7. IMAGE FEATURE EXTRACTION: We extract SIFT features and color histograms for each image. We calculate the distance between images as the weighted sum of the distances between their two distributions (SIFT codebook and color data). If images have annotations, these can easily be incorporated by adding a third part to the representation of each image.
  8. ONTOGEN ON IMAGE DATA: On the next few slides we show the usage of OntoGen on one simple data set. The data was taken from the ImageNet online image collection. The particular subset contains images of various types of flowers, as well as images of fire and images of buildings.
  9. MAIN WINDOW WHEN THE COLLECTION IS LOADED
  10. DOCUMENT LIST FOR QUICK OVERVIEW
  11. DOCUMENT ATLAS WHEN NOT DISPLAYING IMAGES
  12. DOCUMENT ATLAS WHEN DISPLAYING IMAGES
  13. CREATING AN ONTOLOGY: We can do k-means clustering to detect groups of similar images. We can use these groups to create a level in the ontology. The relevant features are displayed on top of the nodes.
  14. SO, LET’S LOOK AT SOME OF THOSE NODES AND THEIR MEDOIDS… PRETTY GOOD…
  15. HOWEVER… One of the first-level sub-concepts is not good, which can be seen by observing its medoids. So, now we can branch it further into more refined sub-concepts to improve the quality.
  16. BEFORE WE DO SO, WE CAN VISUALIZE THE SUB-CONCEPT IN DOCUMENT ATLAS
  17. SO… This is clear evidence that the concept should be split into at least two different sub-concepts. Most of the images inside it represent buildings, but there are some that belong to a certain type of flower, as well as some depicting fire. So, just to be safe, let’s say we want 5 sub-concepts.
  18. THIS IS WHAT THE NEW ONTOLOGY WILL LOOK LIKE:
  19. AND THE MEDOIDS FOR THE FIVE NEW REFINED SUB-CONCEPTS ARE:
  20. CONCLUSIONS: What we see is that we can construct an image ontology in a semi-supervised way. By using k-means clustering based on the SIFT+color image representation, we can detect candidates for concepts in the ontology and then refine them until we reach good quality.
  21. ACKNOWLEDGEMENTS: This work was supported by the bilateral project between Slovenia and Romania “Understanding Human Behavior for Video Surveillance Applications,” the Slovenian Research Agency and the ICT Programme of the EC PlanetData (ICTNoE-257641).
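
The image representation described in the slides on SIFT features, color histograms and image feature extraction can be illustrated with a minimal sketch. It is not OntoGen's implementation: the function names, the Euclidean distance, the equal weights and the bucket count are all assumptions made for illustration, and the local descriptors are assumed to have been extracted already by some SIFT implementation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def color_histogram(pixels, bins_per_channel=4):
    """Normalized color histogram of (r, g, b) pixels with values in 0..255.

    Each channel is divided into `bins_per_channel` buckets, so the
    histogram has bins_per_channel ** 3 entries.
    """
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        r_bin = r * bins_per_channel // 256
        g_bin = g * bins_per_channel // 256
        b_bin = b * bins_per_channel // 256
        hist[(r_bin * bins_per_channel + g_bin) * bins_per_channel + b_bin] += 1.0
        count += 1
    return [h / count for h in hist] if count else hist


def codebook_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest codebook vector and count.

    `codebook` is the list of "representative" vectors obtained by
    quantizing the feature space; the result is normalized to sum to 1.
    """
    hist = [0.0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda k: euclidean(d, codebook[k]))
        hist[nearest] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def image_distance(img_a, img_b, w_sift=0.5, w_color=0.5):
    """Weighted sum of the distances between the two parts of the representation.

    The weights are hypothetical; the slides do not state the values used.
    """
    return (w_sift * euclidean(img_a["sift"], img_b["sift"])
            + w_color * euclidean(img_a["color"], img_b["color"]))


# Toy usage with 2-dimensional stand-ins for real 128-dimensional SIFT descriptors.
toy_codebook = [[0.0, 0.0], [1.0, 1.0]]
image_a = {
    "sift": codebook_histogram([[0.1, 0.0], [0.9, 1.0]], toy_codebook),
    "color": color_histogram([(255, 0, 0), (250, 10, 5)]),
}
image_b = {
    "sift": codebook_histogram([[0.9, 0.8], [1.0, 1.1]], toy_codebook),
    "color": color_histogram([(0, 0, 255), (5, 10, 250)]),
}
distance_ab = image_distance(image_a, image_b)
```

Annotations, where available, could be handled the same way by adding a third histogram and a third weighted term to `image_distance`.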

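
The clustering and refinement steps from the slides (k-means to propose concepts, medoids to inspect them, and re-clustering a poor concept into sub-concepts) can be sketched in the same spirit. This is a plain k-means under assumed choices, not OntoGen's code: the Euclidean metric, the deterministic evenly spaced initialization and all function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, max_iterations=50):
    """Basic k-means; returns (labels, centroids).

    Centroids are initialized from evenly spaced input points, an
    assumption chosen only to keep the sketch deterministic.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    labels = None
    for _ in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids will not move
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def medoid(members):
    """The member with the smallest total distance to all other members."""
    return min(members, key=lambda m: sum(euclidean(m, o) for o in members))


# Toy usage: two well-separated groups of 2-D "image vectors".
vectors = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centroids = kmeans(vectors, 2)

# Inspect one concept through its medoid, as done on the slides.
first_concept = [v for v, lab in zip(vectors, labels) if lab == 0]
representative = medoid(first_concept)

# Refine a concept that looks mixed by clustering only its members again.
second_concept = [v for v, lab in zip(vectors, labels) if lab == 1]
sub_labels, sub_centroids = kmeans(second_concept, 2)
```

Re-running `kmeans` on the members of a single concept is what produces the next level of the ontology; the number of sub-concepts (5 in the slides) is chosen by the user after inspecting the medoids.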