Our presentation at MMM 2013, Huangshan, China.


- 1. Semi-supervised concept detection by learning the structure of similarity graphs. Symeon Papadopoulos1, Christos Sagonas1, Ioannis Kompatsiaris1, Athena Vakali2. 1 Centre for Research and Technology Hellas, Information Technologies Institute; 2 Aristotle University of Thessaloniki, Informatics Department. 19th International Conference on Multimedia Modeling, Huangshan, China, Jan 7-9, 2013
- 2. [Figure: example MIR-Flickr images contrasting user-supplied IMAGE TAGS with target CONCEPTS, e.g. tags "chocolate, cake, chocolate ganache, buttercream" vs. concept "food"; tags "nature, landscape, clouds, water, lake, reflection, sky, mirror" vs. concept "water"; a portrait with tags "female, indoor, people, portrait" vs. concept N/A] SOURCE: MIR-Flickr mklab.iti.gr #2
- 3. Overview• Problem formulation• Related work• Graph Structure Features Approach• Evaluation – Synthetic datasets – MIR-Flickr• Conclusions mklab.iti.gr #3
- 4. Overview• Problem formulation• Related work• Graph Structure Features Approach• Evaluation – Synthetic datasets – MIR-Flickr• Conclusions mklab.iti.gr #4
- 5. Concept detection. ML perspective: given an image, produce a set of relevant concepts. IR perspective: given an image collection and a concept of interest, rank all images in order of relevance. mklab.iti.gr #5
- 6. Semi-supervised learning • Transductive learning setting: a set of target concepts; an annotated set X_L = {(x_i, y_i)}, where x_i is the D-dimensional feature vector extracted from image i and y_i is the concept indicator vector (labels) for image i; and a set of unknown items X_U. Predict the concepts associated with the items of X_U by processing X_L and X_U together. mklab.iti.gr #6
- 7. Overview• Problem formulation• Related work• Graph Structure Features Approach• Evaluation – Synthetic datasets – MIR-Flickr• Conclusions mklab.iti.gr #7
- 8. Related work• Neighborhood similarity (Wang et al., 2009) – Uses image similarity graphs in combination with graph-based SSL (Zhu, 2005; Zhou et al., 2004) – Not incremental• Sparse similarity graph by convex optim. (Tang et al., 2009) – Applicable to online settings - Computationally intensive training step• Hashing-based graph construction (Chen et al., 2010) – Uses KL divergence multi-label propagation, but relies on iterative computational scheme – Difficult to apply in incremental settings• Social dimensions (Tang & Liu, 2011) – Uses LEs for networked classification problems (i.e. when network between nodes is explicit) – Not incremental, not applied to multimedia mklab.iti.gr #8
- 9. Overview• Problem formulation• Related work• Graph Structure Features Approach• Evaluation – Synthetic datasets – MIR-Flickr• Conclusions mklab.iti.gr #9
- 10. Graph Structure Features (GSF) mklab.iti.gr #10
- 11. Graph construction: image similarity graph G = (V, E), where V is the set of nodes-images and n the cardinality of the node set. Construction options: • full weighted graph • kNN graph (connect each image to its k most similar images) • εNN graph (connect pairs of images whose similarity exceeds the threshold ε) mklab.iti.gr #11
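The kNN construction above can be sketched in a few lines. The Gaussian (heat-kernel) similarity, the parameter names, and the "keep an edge if either endpoint selects it" symmetrization rule are illustrative assumptions, not details from the paper:

```python
import numpy as np

def knn_similarity_graph(X, k=3, sigma=1.0):
    """Build a symmetric kNN image-similarity graph from feature vectors.

    X: (n, D) matrix of feature vectors, one row per image.
    Returns an (n, n) weighted adjacency matrix W.
    """
    n = X.shape[0]
    # pairwise squared Euclidean distances
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    # heat-kernel similarity (assumed; the paper does not fix the kernel here)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # keep only the k largest similarities per row
    mask = np.zeros_like(W, dtype=bool)
    idx = np.argsort(-W, axis=1)[:, :k]
    mask[np.repeat(np.arange(n), k), idx.ravel()] = True
    # symmetrize: keep an edge if either endpoint selected it
    return np.where(mask | mask.T, W, 0.0)
```

An εNN variant would simply replace the top-k mask with `W >= eps`.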
- 12. Eigenvector/value computation. Normalized graph Laplacian L = D^{-1/2}(D - W)D^{-1/2}, where D is the degree matrix (diagonal) and W the adjacency matrix (typical form of the graph Laplacian: L = D - W). The eigenvectors corresponding to the d smallest non-zero eigenvalues give the graph structure features*, obtained by solving Lv = λv. *aka Laplacian Eigenmaps mklab.iti.gr #12
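A minimal numpy sketch of this step (the dense `eigh` solver and the zero-eigenvalue threshold are choices made here for clarity; the paper does not prescribe them):

```python
import numpy as np

def graph_structure_features(W, d=2):
    """Laplacian Eigenmaps: eigenvectors of the normalized graph Laplacian.

    Returns the d eigenvectors associated with the smallest non-zero
    eigenvalues of L = I - D^{-1/2} W D^{-1/2}; each row is the graph
    structure feature vector of one node.
    """
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    L = np.eye(W.shape[0]) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    nz = vals > 1e-8                 # skip the (near-)zero eigenvalue(s)
    return vecs[:, nz][:, :d]
```

On a graph with two dense communities joined by a weak bridge, the first returned eigenvector (the Fiedler vector of the normalized Laplacian) separates the communities by sign, which is exactly the intuition of slide 14.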
- 13. Graph structure feature learning • Each media item is represented by a d-dimensional feature vector. • At this point, any supervised learning method could be used [note that the whole framework is still SSL, since unlabeled items are used during graph construction]. • SVM is selected: – good performance in several problems – good implementations available (LibSVM, LIBLINEAR) – real-valued output (IR perspective: rank images by concept) mklab.iti.gr #13
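To show where the learner plugs in, here is a sketch of this step. The slides choose an SVM (LibSVM/LIBLINEAR); the ridge-regression scorer below is a stand-in used only so the example stays dependency-free, and is NOT the authors' learner:

```python
import numpy as np

def train_concept_scorer(F_train, y_train, lam=1e-2):
    """Fit a linear scorer on graph structure features.

    Stand-in for the SVM of the slides: ridge regression on the
    per-item GSF vectors, with a bias term.
    """
    n, d = F_train.shape
    A = np.hstack([F_train, np.ones((n, 1))])  # append bias column
    return np.linalg.solve(A.T @ A + lam * np.eye(d + 1), A.T @ y_train)

def concept_scores(F, w):
    """Real-valued scores, usable to rank images by concept relevance."""
    return np.hstack([F, np.ones((F.shape[0], 1))]) @ w
```

One scorer is trained per target concept; ranking the unlabeled items by `concept_scores` realizes the IR perspective of slide 5.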
- 14. Intuition: [Figure: similarity graph of images labeled "coast" and "coast, person"; the values of the 2nd eigenvector of the graph Laplacian (≈0.24 to 0.31 for the "coast" images vs. ≈ −0.46 for the "coast, person" images) separate the two groups] mklab.iti.gr #14
- 15. Incremental learning setting (1)• Transductive learning setting often impractical. For each new set of unlabeled items: 1. recompute image similarity matrix 2. recompute graph structure features (LEs) 3. use SVM to obtain prediction scores• Step 2 is computationally expensive.• Devise two incremental schemes: – Linear Projection (LP) : set of k most similar images – Submanifold Analysis (SA) [cf. next slide] mklab.iti.gr #15
- 16. Incremental learning setting (2)• Submanifold Analysis [Jia et al., 2009] – Construct the (k+1)x(k+1) similarity matrix WS between the new item and the k most similar images from the annotated set – Construct the sub-degree (diagonal) and sub-Laplacian matrices – Compute the eigenvalues and the d eigenvectors corresponding to non-zero eigenvalues [computation is lightweight since k << n] – Minimize the reconstruction error – Reconstruct the approximate eigenvectors mklab.iti.gr #16
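The steps above can be sketched as follows. This is only a rough illustration of the Submanifold Analysis idea of Jia et al. (2009): the heat-kernel similarity and the least-squares reconstruction of the new item's sub-eigenvector row are assumptions made here, not the paper's exact formulation:

```python
import numpy as np

def incremental_gsf(x_new, X_ann, F_ann, k=5, sigma=1.0):
    """Approximate graph structure features for a new, unseen item.

    x_new: (D,) feature vector of the new item.
    X_ann: (n, D) features of the annotated set.
    F_ann: (n, d) previously computed graph structure features.
    """
    # 1. find the k most similar annotated items
    d2 = ((X_ann - x_new) ** 2).sum(axis=1)
    nn = np.argsort(d2)[:k]
    pts = np.vstack([x_new, X_ann[nn]])
    # 2. (k+1)x(k+1) sub-similarity matrix WS and sub-Laplacian
    dd = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
    WS = np.exp(-dd / (2.0 * sigma ** 2))
    np.fill_diagonal(WS, 0.0)
    LS = np.diag(WS.sum(1)) - WS
    # 3. eigenvectors of non-zero eigenvalues (cheap: k << n)
    vals, vecs = np.linalg.eigh(LS)
    V = vecs[:, vals > 1e-8]
    # 4. least-squares weights expressing the new item's row through the
    #    neighbours' rows; reuse them on the full features F_ann
    w, *_ = np.linalg.lstsq(V[1:].T, V[0], rcond=None)
    return w @ F_ann[nn]
```

The approximate feature vector can then be fed to the already trained SVM without recomputing the full eigendecomposition.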
- 17. Fusion of multiple features Graph struct. feature fusion (F-GSF) Feature fusion (F-FEAT)Similarity graph fusion (F-SIM) Result fusion (F-RES) mklab.iti.gr #17
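Of the four fusion options, result fusion (F-RES) is the simplest to sketch: the per-feature SVM prediction scores are combined by a weighted average. The uniform default weights are an assumption for illustration:

```python
import numpy as np

def result_fusion(score_lists, weights=None):
    """Late (result) fusion: weighted average of per-feature score vectors.

    score_lists: sequence of per-feature score arrays, one per image.
    """
    S = np.asarray(score_lists, dtype=float)  # shape (n_features, n_images)
    if weights is None:
        weights = np.full(S.shape[0], 1.0 / S.shape[0])  # assumed uniform
    return weights @ S
```

F-SIM and F-GSF would instead combine the similarity graphs or the GSF vectors, respectively, before a single SVM is trained.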
- 18. Overview• Problem formulation• Related work• Graph Structure Features Approach• Evaluation – Synthetic datasets – MIR-Flickr• Conclusions mklab.iti.gr #18
- 19. Synthetic data - experiments• Use of four 2D distributions with limited number of samples (thousands) to test many settings TWO MOONS LINES CIRCLES GAUSSIANS• Performance aspects – Parameters of approach: number of features (CD), graph construction technique (kNN, εNN) and parameters (k, ε) – Learning setting (training size, data noise, nr. of classes) – Inductive learning (LP vs SA) – Fusion method mklab.iti.gr #19
- 20. Role of number of GSF (CD): [plots for TWO MOONS, LINES, CIRCLES, GAUSSIANS at several noise levels] Higher CD yields better mAP; higher noise requires higher CD. mklab.iti.gr #20
- 21. Role of graph construction technique: [plots comparing kNN and εNN] kNN is better and less sensitive than εNN. mklab.iti.gr #21
- 22. Role of noise (σ): [plots for TWO MOONS, LINES, CIRCLES, GAUSSIANS against competing methods] In most cases GSF is equal to or better than the expensive SVM-RBF. mklab.iti.gr #22
- 23. Role of training samples (α%): [plots for TWO MOONS, LINES, CIRCLES, GAUSSIANS] In most cases few training samples (2-5%) are sufficient for high accuracy. mklab.iti.gr #23
- 24. Number of classes (K): [plots for LINES, CIRCLES] Sufficiently good accuracy wrt. the number of classes (much better than linear SVM, a bit worse than SVM-RBF). mklab.iti.gr #24
- 25. Scalability wrt. number of features: [plots contrasting linearly increasing cost wrt. dimensionality vs. constant cost wrt. dimensionality] mklab.iti.gr #25
- 26. Comparison between fusion methods: [plots for LINES, CIRCLES] Even when one feature goes bad, result fusion and GSF fusion still do better than the best single feature. mklab.iti.gr #26
- 27. Incremental schemes: [plots for TWO MOONS, LINES, CIRCLES, GAUSSIANS] SA is much better and less sensitive than LP. mklab.iti.gr #27
- 28. Overview• Problem formulation• Related work• Graph Structure Features Approach• Evaluation – Synthetic datasets – MIR-Flickr• Conclusions mklab.iti.gr #28
- 29. Experimental setting• MIR-Flickr – 25,000 images + tags – 38 concepts (24 + 14 with two interpretations [strict/rel])• Benchmark methods – Semantic Spaces (SESPA) [Hare & Lewis, 2010] – Multiple Kernel Learning (MKL) [Guillaumin et al., 2010] mklab.iti.gr #29
- 30. GSF vs SESPA: [results table] GSF-F1, F2, F3: single-feature GSF; GSF-C: graph structure feature fusion; GSF-D1, D2: result fusion using LIBLINEAR (D1) and RBF (D2) mklab.iti.gr #30
- 31. GSF vs MKL: [VISUAL and TAG results plots] MKL better in: baby, bird, river, sea. GSF better in: baby, bird, car, dog, river, sea. Possible thanks to scalable behavior wrt. the number of features. mklab.iti.gr #31
- 32. Example results mklab.iti.gr #32
- 33. Evaluation: adding unlabeled samples (1) ~6% relative increase in mAP GIST mklab.iti.gr #33
- 34. Evaluation: adding unlabeled samples (2) ~12% relative increase in mAP DenseSiftV3H1 mklab.iti.gr #34
- 35. Evaluation: adding unlabeled samples (3) ~4% relative increase in mAP TagRaw50 mklab.iti.gr #35
- 36. Overview• Problem formulation• Related work• Graph Structure Features Approach• Evaluation – Synthetic datasets – MIR-Flickr• Conclusions mklab.iti.gr #36
- 37. Conclusions• Concept detection approach based on the structure of image similarity graphs – Transductive learning setting – Two variants for online learning• Thorough experimental analysis – Behavior under a variety of settings/parameters – Equivalent or better behavior compared to SoA approaches• Fast: – SA with k=5 takes 38.4 msec per image (not incl. feature extraction) – Future work: further analysis of computational characteristics + application to larger-scale datasets (NUS-WIDE, ImageNet) mklab.iti.gr #37
- 38. Thank you! Further contact: papadop@iti.gr www.socialsensor.eu mklab.iti.gr #38
- 39. References (1)• Graph-based semi-supervised learning Zhu, X.: Semi-supervised learning with graphs. PhD Thesis, Carnegie Mellon University, 0-542-19059-1 (2005) Zhou, D., Bousquet, O., Navin Lal, T., Weston, J., Schoelkopf, B.: Learning with Local and Global Consistency. Advances in NIPS 16, MIT Press (2004), 321-328• Related approaches Wang, M., Hua, X.-S., Tang, J., Hong, R.: Beyond distance measurement: constructing neighborhood similarity for video annotation. TMM 11 (3) (2009), 465-476 Tang, J. et al.: Inferring semantic concepts from community contributed images and noisy tags. ACM Multimedia (2009), 223-232 Chen, X. et al.: Efficient large scale image annotation by probabilistic collaborative multi-label propagation. ACM Multimedia (2010), 35-44 Tang, L., Liu, H.: Leveraging social media networks for classification. Data Mining and Knowledge Discovery 23 (3) (2011), 447-478 mklab.iti.gr #39
- 40. References (2)• Relational classification Macskassy, S.A., Provost, F.: Classification in Networked Data: A Toolkit and a Univariate Case Study. Journal of Machine Learning Research 8 (2007), 935-983• Laplacian Eigenmaps Belkin, M., Niyogi, P.: Laplacian Eigenmaps for dimensionality reduction and data representation. Neural Computation 15 (6), MIT Press (2003), 1373-1396 Jia, P., Yin, J., Huang, X., Hu, D.: Incremental Laplacian eigenmaps by preserving adjacent information between data points. PR Letters 30 (16) (2009), 1457-1463 mklab.iti.gr #40
- 41. References (3)• Tools Leyffer, S., Mahajan, A.: Nonlinear Constrained Optimization: Methods and Software. Preprint ANL/MCS-P1729-0310 (2010) Fan, R., Chang, K., Hsieh, C., Wang, X., Lin, C.: LIBLINEAR: A Library for Large Linear Classification. Journal of ML Research 9 (2008), 1871-1874 Chang, C.-C., Lin, C.-J.: LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology 2 (3) (2011), 27:1-27:27• Dataset Huiskes, M.J., Lew, M.S.: The MIR Flickr Retrieval Evaluation. Proceedings of ACM Intern. Conf. on Multimedia Information Retrieval (2008)• Competing methods Hare, J.S., Lewis, P.H.: Automatically annotating the MIR Flickr dataset. ACM ICMR (2010), 547-556 Guillaumin, M., Verbeek, J., Schmid, C.: Multimodal semi-supervised learning for image classification. Proceedings of IEEE CVPR Conference (2010), 902-909 mklab.iti.gr #41
- 42. mklab.iti.gr #42