Btv thesis defense_v1.02-final

Published in: Education, Technology

  1. Improvement of Content-Based Image Retrieval by Using Clustering and Relevance Feedback. Master Thesis Defense. Bui The Vinh, May 13, 2010
  2. Contents: Introduction; Image Features & Similarity; Clustering Algorithm; Relevance Feedback; Implementation and Evaluation; Conclusions and Future Work
  3. Introduction. Key points: how to represent an image; how to determine whether two images are similar. [Figure: framework]
  4. Introduction. Practical applications: medical diagnosis, crime prevention, online shopping, etc. Challenges: real-time operation; high accuracy. Contributions: build a complete CBIR system; reduce search time by using clustering; increase accuracy by applying a support vector machine in relevance feedback.
  5. Contents: Introduction; Image Features & Similarity; Clustering Algorithm; Relevance Feedback; Implementation and Evaluation; Conclusions and Future Work
  6. Feature Extraction Model. [Figure: feature extraction model, with labels F1, F2, F3 and B] Basic image features: color, shape, texture.
  7. Image Representation. CEDD, the color and edge directivity descriptor (proposed by Chatzichristofis and Boutalis), incorporates color and texture information in a histogram. Each image is represented by a high-dimensional real vector, for example: 0 0 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0. The vector representing an image depends on the method used to extract its features.
  8. Similarity Measurement. Formula: calculate the distance between the two corresponding feature vectors, using the Tanimoto distance. [Figure: feature vectors F1, F2, F3]
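The formula on the similarity slide did not survive in this transcript. As a rough illustration only, the sketch below uses the standard Tanimoto coefficient for real-valued vectors, T(a, b) = a·b / (|a|² + |b|² − a·b), and takes the distance to be 1 − T. The class name `TanimotoSketch`, the 1 − T scaling, and the handling of two all-zero vectors are assumptions for this example, not details taken from the thesis.

```java
final class TanimotoSketch {

    /** Tanimoto coefficient: a.b / (|a|^2 + |b|^2 - a.b); equals 1.0 for identical non-zero vectors. */
    static double coefficient(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have equal length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        double denominator = normA + normB - dot;
        // The denominator is zero only when both vectors are all zeros;
        // treating them as identical is an assumed convention.
        return denominator == 0.0 ? 1.0 : dot / denominator;
    }

    /** Assumed distance form: 0.0 for identical vectors, 1.0 for vectors with no overlap. */
    static double distance(double[] a, double[] b) {
        return 1.0 - coefficient(a, b);
    }

    public static void main(String[] args) {
        // Short made-up histograms; real CEDD vectors are much longer.
        double[] f1 = {0, 0, 7, 0, 2};
        double[] f2 = {0, 0, 7, 0, 2};
        double[] f3 = {1, 0, 0, 3, 0};
        System.out.println(distance(f1, f2)); // 0.0 (identical)
        System.out.println(distance(f1, f3)); // 1.0 (no shared non-zero bins)
    }
}
```

A smaller distance means the two images are more similar, so ranking the database by ascending distance to the query vector gives the initial retrieval result.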
  9. Contents: Introduction; Image Features; Clustering Algorithm; Relevance Feedback; Implementation and Evaluation; Conclusions and Future Work
  10. Overview of Clustering. Motivation: the amount of image data involved is very large. Goal: find groups of objects such that the objects in a group are similar to one another and different from the objects in other groups.
  11. K-means Clustering. Definition: K-means is a partitional clustering algorithm based on iterative relocation that partitions a dataset into k clusters. Objective: given a set of observations (x1, x2, …, xn), partition them into k sets (k < n), X = {X1, X2, …, Xk}, so as to locally minimize the sum of squared distances between the data points and their corresponding cluster centers.
  12. K-means Clustering (2). Algorithm: initialize k cluster centers randomly, then repeat until convergence: (a) cluster assignment step: assign each data point xi to the cluster fh whose center is nearest to xi; (b) center re-estimation step: re-estimate each cluster center as the mean of the points in that cluster.
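The two steps of the K-means slides can be sketched as follows. This is a minimal illustration, not the thesis code: the class name `KMeansSketch`, the use of squared Euclidean distance, the choice of k distinct data points as the random initial centers, and the seed and iteration-cap parameters are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Random;

final class KMeansSketch {

    /** Returns, for each point, the index of the cluster it ends up in. Requires k <= points.length. */
    static int[] cluster(double[][] points, int k, long seed, int maxIterations) {
        int n = points.length;
        int dim = points[0].length;
        Random random = new Random(seed);

        // Random initialization (assumed variant): pick k distinct data points
        // as the initial centers via a partial Fisher-Yates shuffle.
        int[] order = new int[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        double[][] centers = new double[k][];
        for (int h = 0; h < k; h++) {
            int j = h + random.nextInt(n - h);
            int tmp = order[h];
            order[h] = order[j];
            order[j] = tmp;
            centers[h] = points[order[h]].clone();
        }

        int[] assignment = new int[n];
        Arrays.fill(assignment, -1);
        for (int iteration = 0; iteration < maxIterations; iteration++) {
            // Cluster assignment step: each point goes to its nearest center.
            boolean changed = false;
            for (int i = 0; i < n; i++) {
                int best = nearest(points[i], centers);
                if (best != assignment[i]) {
                    assignment[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged: no point switched cluster
            }
            // Center re-estimation step: each center becomes the mean of its points.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int i = 0; i < n; i++) {
                counts[assignment[i]]++;
                for (int d = 0; d < dim; d++) {
                    sums[assignment[i]][d] += points[i][d];
                }
            }
            for (int h = 0; h < k; h++) {
                if (counts[h] == 0) {
                    continue; // empty cluster: keep the previous center
                }
                for (int d = 0; d < dim; d++) {
                    centers[h][d] = sums[h][d] / counts[h];
                }
            }
        }
        return assignment;
    }

    static int nearest(double[] x, double[][] centers) {
        int best = 0;
        for (int h = 1; h < centers.length; h++) {
            if (squaredDistance(x, centers[h]) < squaredDistance(x, centers[best])) {
                best = h;
            }
        }
        return best;
    }

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            sum += (a[d] - b[d]) * (a[d] - b[d]);
        }
        return sum;
    }
}
```

For example, `KMeansSketch.cluster(points, 2, 42L, 100)` on two well-separated groups of points returns one label per point, with points from the same group sharing a label. Because the result depends on the initial centers, the algorithm only reaches a local minimum of the objective.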
  13. Contents: Introduction; Image Features; Clustering Algorithm; Relevance Feedback; Implementation and Evaluation; Conclusions and Future Work
  14. Relevance Feedback? Motivation: searching on low-level image features alone is limited. Mechanism: after the initial retrieval results are presented, the user provides feedback on the relevance of one or more of the retrieved images; this feedback is used to reformulate the query; new results are produced from the reformulated query. Challenges: real-time processing is required; the training data set is small.
  15. RF Architecture. [Figure: the query image goes to the CBIR system, which ranks the images database and returns ranked images (Img1, Img2, Img3, ...); the user's feedback on these images drives query reformulation; the revised query yields re-ranked images (Img2, Img4, Img5, ...)]
  16. Support Vector Machine. Classification method: given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that predicts which category a new example falls into. Linear case: training data; a separating hyperplane; the optimal separating hyperplane (OSH).
  17. Support Vector Machine (2). Linear case (cont.): the classification function. Non-linear case: the classification function; kernels.
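The classification functions on the SVM slides were not preserved in this transcript. In the standard textbook formulation, the non-linear classifier is f(x) = sign(Σ αᵢ yᵢ K(xᵢ, x) + b), and the RBF kernel is K(x, y) = exp(−γ‖x − y‖²). The sketch below only evaluates such a function for an already trained model; the class name `RbfSvmSketch`, the parameter names, and the convention of mapping a decision value of exactly zero to +1 are assumptions, and training is left out.

```java
final class RbfSvmSketch {

    /** RBF kernel: exp(-gamma * ||x - y||^2). */
    static double rbfKernel(double[] x, double[] y, double gamma) {
        double squared = 0.0;
        for (int d = 0; d < x.length; d++) {
            squared += (x[d] - y[d]) * (x[d] - y[d]);
        }
        return Math.exp(-gamma * squared);
    }

    /** Decision value: sum_i alpha_i * y_i * K(x_i, x) + b, summed over the support vectors. */
    static double decisionValue(double[][] supportVectors, double[] alphas, int[] labels,
                                double bias, double gamma, double[] x) {
        double sum = bias;
        for (int i = 0; i < supportVectors.length; i++) {
            sum += alphas[i] * labels[i] * rbfKernel(supportVectors[i], x, gamma);
        }
        return sum;
    }

    /** Predicted category: +1 or -1 (a value of exactly zero maps to +1 by assumption). */
    static int classify(double[][] supportVectors, double[] alphas, int[] labels,
                        double bias, double gamma, double[] x) {
        return decisionValue(supportVectors, alphas, labels, bias, gamma, x) >= 0.0 ? 1 : -1;
    }
}
```

In a relevance feedback setting, the two categories would correspond to images the user marked as relevant (+1) and not relevant (−1); how the thesis turns the classifier output into a new ranking is not stated on the slides.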
  18. Contents: Introduction; Image Features; Clustering Algorithm; Relevance Feedback; Implementation and Evaluation; Conclusions and Future Work
  19. Clustering Implementation. Clustering: take the database of feature vectors as input and apply the K-means algorithm to cluster it. Finding: find the cluster that best matches the query image.
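One way to read the "finding" step is sketched below: compare the query vector against the cluster centers only, then rank just the images in the closest cluster. The slides do not spell out this procedure, so the class name `ClusterSearchSketch`, the squared Euclidean distance, and the input layout (one cluster index per database image plus one center per cluster) are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ClusterSearchSketch {

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            sum += (a[d] - b[d]) * (a[d] - b[d]);
        }
        return sum;
    }

    /** Index of the cluster center closest to the query vector. */
    static int nearestCenter(double[] query, double[][] centers) {
        int best = 0;
        for (int h = 1; h < centers.length; h++) {
            if (squaredDistance(query, centers[h]) < squaredDistance(query, centers[best])) {
                best = h;
            }
        }
        return best;
    }

    /**
     * Returns the indices of the database images in the query's nearest cluster,
     * ordered from most to least similar. Images in other clusters are never compared.
     */
    static List<Integer> search(double[] query, double[][] database,
                                int[] assignment, double[][] centers) {
        int cluster = nearestCenter(query, centers);
        List<Integer> candidates = new ArrayList<>();
        for (int i = 0; i < database.length; i++) {
            if (assignment[i] == cluster) {
                candidates.add(i);
            }
        }
        candidates.sort(Comparator.comparingDouble(i -> squaredDistance(query, database[i])));
        return candidates;
    }
}
```

With k clusters of roughly equal size, a query is compared against k centers plus about one k-th of the database instead of every image, which is where the search-time saving would come from. The trade-off is that similar images assigned to a neighboring cluster are not returned.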
  20. RF Implementation. Support vector machine classifier: suitable when the amount of training data is small; can be applied in a real-time system.
  21. Environment & Parameters. Environment: 9,918 images of various kinds; desktop computer with an Intel Core 2 Duo 3.16 GHz, 4 GB RAM, Windows 7 Ultimate; Sun Java 1.6 update 7; all components of the system are implemented in Java. Parameters: K = 7 for the K-means algorithm; radial basis function (RBF) kernel for the support vector machine.
  22. Clustering Evaluation. Accuracy: clustering does not adversely affect accuracy.
  23. Clustering Evaluation. Searching time: applying clustering significantly reduces searching time.
  24. RF Evaluation. Accuracy: accuracy improves after several feedback iterations.
  25. Contents: Introduction; Image Features; Clustering Algorithm; Relevance Feedback; Implementation and Evaluation; Conclusions and Future Work
  26. Conclusion. Achievements: a complete content-based image retrieval system was built; search performance is significantly improved by applying the K-means algorithm to cluster the image database; using a support vector machine in relevance feedback markedly increases accuracy. Shortcomings: the low-level feature-based search depends on a method proposed by other authors. Future work: develop a low-level feature-based search method suited to each image domain.
