Cmap presentation
The presentation accompanying the ECCV 2014 paper "ConceptMap: Mining Noisy Web Data for Concept Learning" by Eren Golge and Pinar Duygulu.


Transcript

  • 1. ConceptMap: Learning Visual Concepts from Weakly-Labeled WWW Images. A work by Eren Golge, supervised by Asst. Prof. Pinar Duygulu
  • 2. Dictionary ● Visual Concept – the visual correspondence of a semantic value – objects (car, bus …), attributes (red, metallic …) or scenes (indoor, kitchen, office …) ● Polysemy – multiple semantic meanings for a given word ● Model – a classifier, in the Machine Learning sense ● BoW – Bag of Words feature representation
  • 3. Problems ● Hard to obtain large labeled data ● Query Web sources: Google, Bing, Yahoo, etc. ● Evade polysemy and irrelevancy in the gathered data ● Deal with domain adaptation ● Learn salient models ● Use lower-level concept models – objects – to discover higher-level concepts – scenes
  • 4. General Pipeline: GATHER DATA from the Web → CLUSTER and remove OUTLIERS → LEARN CLASSIFIERS
  • 5. Hassles ● Polysemy ● Irrelevancy ● Data size ● Model learning
  • 6. Method #1: CMAP ● Polysemy: Clustering ● Irrelevancy: Outlier detection + Rectifying Self-Organizing Map (RSOM) ● Accepted for ECCV 2014. Draft version: http://arxiv.org/abs/1312.4384
  • 7. RSOM ● A very generic method, applicable to other domains as well (textual, biological, etc.) ● An extension of SOM (a.k.a. Kohonen's Map) * ● Inspired by biological phenomena ** ● Able to cluster data and detect outliers ● IRRELEVANCY SOLVED!! *Kohonen, T.: Self-organizing maps. Springer (1997) **Hubel, D.H., Wiesel, T.N.: Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. The Journal of Physiology 160(1) (1962) 106 (Figure: outlier clusters vs. outlier instances in salient clusters)
  • 8. RSOM cont': finding outlier units ● Look at the activation statistics of each SOM unit during the learning phase ● Later learning iterations are more reliable ● IF a unit is activated RARELY → OUTLIER; FREQUENTLY → SALIENT (Figure: winner activations vs. neighbor activations)
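The unit-level statistic above can be sketched as follows. This is a toy re-implementation, not the authors' code: `train_som_with_stats`, its parameters, and the thresholding rule in `split_units` are all illustrative choices; the real RSOM also uses neighborhood updates and neighbor activations, which are omitted here for brevity.

```python
import numpy as np

def train_som_with_stats(X, n_units=16, n_iters=1000, lr=0.5, seed=0):
    """Toy SOM that records per-unit winner activations, weighting
    later iterations more heavily (they are more reliable)."""
    rng = np.random.default_rng(seed)
    W = X[rng.choice(len(X), n_units, replace=False)].astype(float)
    activation = np.zeros(n_units)
    for t in range(n_iters):
        x = X[rng.integers(len(X))]
        winner = np.argmin(np.linalg.norm(W - x, axis=1))
        # Later iterations contribute more to the reliability statistic.
        activation[winner] += t / n_iters
        # Update the winner with a decaying learning rate
        # (neighborhood update omitted for brevity).
        W[winner] += lr * (1 - t / n_iters) * (x - W[winner])
    return W, activation

def split_units(activation, frac=0.5):
    """Units activated rarely -> outlier units; frequently -> salient."""
    thresh = frac * activation.mean()
    salient = np.where(activation >= thresh)[0]
    outlier = np.where(activation < thresh)[0]
    return salient, outlier
```

A unit that wins only for a few stray images accumulates little late-iteration activation and ends up in the outlier set, so its whole cluster can be discarded.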
  • 9. RSOM cont': finding sole outliers
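One way to flag sole outliers inside an otherwise salient cluster is a distance test against the cluster's SOM unit; this is an assumed sketch of the idea (the function name and the mean-plus-k-sigma threshold are illustrative, not the paper's exact rule):

```python
import numpy as np

def instance_outliers(X, W, salient_units, k=2.0):
    """Flag instances in salient clusters whose distance to their
    winning unit exceeds mean + k * std of that unit's member distances."""
    # Assign every instance to its nearest SOM unit.
    winners = np.argmin(
        np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2), axis=1)
    flags = np.zeros(len(X), dtype=bool)
    for u in salient_units:
        members = np.where(winners == u)[0]
        if len(members) < 2:
            continue
        d = np.linalg.norm(X[members] - W[u], axis=1)
        flags[members[d > d.mean() + k * d.std()]] = True
    return flags
```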
  • 10. Learning Models ● Learn L1 linear SVM models – Easier to train – Better for high dimensional data (wide data matrix) – Implicit feature selection by L1 norm ● Learn one linear model from each salient cluster ● Each concept has multiple models – POLYSEMY SOLVED!!
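The per-cluster model learning step can be sketched with scikit-learn; a minimal sketch assuming `LinearSVC` as the L1 linear SVM (the helper names and the max-over-models scoring rule are illustrative assumptions, not the authors' code):

```python
import numpy as np
from sklearn.svm import LinearSVC

def learn_cluster_models(clusters, negatives, C=1.0):
    """One L1-regularized linear SVM per salient cluster, so each
    concept gets multiple models (one per visual sense of the word)."""
    models = []
    for pos in clusters:
        X = np.vstack([pos, negatives])
        y = np.r_[np.ones(len(pos)), -np.ones(len(negatives))]
        # penalty='l1' requires dual=False in scikit-learn; the L1 norm
        # drives uninformative feature weights to zero (implicit selection).
        clf = LinearSVC(penalty='l1', dual=False, C=C).fit(X, y)
        models.append(clf)
    return models

def concept_score(models, x):
    """Score a sample by its best-matching model (handles polysemy)."""
    return max(m.decision_function(x.reshape(1, -1))[0] for m in models)
```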
  • 11. CMAP Overview
  • 12. Retrospective ● Fergus et al. [1] – They use a human-annotated control set to cull data – We use data gathered with no human effort ● Berg and Forsyth [2] – They use the surrounding text – We use only the visual content ● OPTIMOL, Li and Fei-Fei [3] – They use seed images and update incrementally – We use no supervision, all in one iteration ● Efros et al. [4] "Discriminative Patches" – They require a large computer cluster and iterative data elimination – We obtain faster and better results on a single computer, with no time-wasting iterations ● CMAP has broader possible applications [1] Fergus, R., Fei-Fei, L., Perona, P., Zisserman, A.: Learning object categories from google's image search. In: Computer Vision, 2005. ICCV 2005 [2] Berg, T.L., Berg, A.C., Edwards, J., Maire, M., White, R., Teh, Y.W., Learned-Miller, E.G., Forsyth, D.A.: Names and faces in the news. In: IEEE Conference on Computer Vision Pattern Recognition (CVPR). Volume 2. (2004) 848–854 [3] Li, L.J., Fei-Fei, L.: Optimol: automatic online picture collection via incremental model learning. International journal of computer vision 88(2) (2010) 147–168 [4] Singh, S., Gupta, A., Efros, A.A.: Unsupervised discovery of mid-level discriminative patches. In: Computer Vision–ECCV 2012. Springer (2012) 73–86
  • 13. Experiments ● Only images are used for learning ● Attacked problems: – Attribute Learning: ImageNet [1], EBAY [2] and Bing images ● Learn texture and color attributes – Scene Learning: MIT-Indoor [4], Scene-15 [5] ● Use attributes as mid-level features – Face Recognition: FAN-Large [6] ● Use the EASY and HARD subsets of the dataset – Object Recognition: Google dataset [3] [1] Russakovsky, O., Fei-Fei, L.: Attribute learning in large-scale datasets. In: Trends and Topics in Computer Vision. Springer (2012) [2] Van De Weijer, J., Schmid, C., Verbeek, J., Larlus, D.: Learning color names for real-world applications. Image Processing, IEEE (2009) [3] Fergus, R., Fei-Fei, L., Perona, P., Zisserman, A.: Learning object categories from google's image search. In: Computer Vision, 2005. ICCV 2005 [4] Quattoni, A., Torralba, A.: Recognizing indoor scenes. CVPR (2009) [5] Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. CVPR 2006 [6] Ozcan, M., Luo, J., Ferrari, V., Caputo, B.: A large-scale database of images and captions for automatic face naming. In: BMVC. (2011)
  • 14. Visual Examples
  • 15. Visual Examples # Faces Salient Clusters Outlier Clusters Outlier Instances
  • 16. Salient Clusters Outlier Clusters Outlier Instances
  • 17. Implementation ● Visual Features: – BoW SIFT with 4000 words (for texture attributes, objects and faces) – 3D 10x20x20 Lab histograms (for attributes) – 256-dimensional LBP [1] (for objects and faces) ● Preprocessing – Attribute: extract random 100x100 non-overlapping image patches from each image – Scene: represent each image with the confidence scores of the attribute classifiers, in a Spatial Pyramid sense – Face: apply face detection [2] to each image and keep the single highest-scoring patch – Object: apply unsupervised saliency detection [3] and keep the single highest-activation region ● Model Learning – Use outliers and samples of the other concepts' instances as the negative set – Apply hard mining – Tune all hyperparameters (classifier and RSOM parameters) via cross-validation ● NOTICE: – We train concept models on Google images, so we must deal with DOMAIN ADAPTATION [1] Ojala, T., Pietikainen, M., Maenpaa, T.: Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. Pattern Analysis and Machine Intelligence, IEEE Transactions on 24(7) (2002) 971–987 [2] Zhu, X., Ramanan, D.: Face detection, pose estimation, and landmark localization in the wild. In: Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, IEEE (2012) 2879–2886 [3] Erdem, E., Erdem, A.: Visual saliency estimation by nonlinearly integrating features using region covariances. Journal of Vision 13(4) (2013) 1–20
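The "hard mining" step above is the standard hard-negative-mining loop; a minimal sketch, again reusing `LinearSVC` as the classifier (the function name, cache size, and round count are illustrative assumptions):

```python
import numpy as np
from sklearn.svm import LinearSVC

def train_with_hard_mining(X_pos, X_neg_pool, n_cache=50, n_rounds=3, seed=0):
    """Start from a random negative cache, then repeatedly retrain and
    refill the cache with the pool negatives the current model scores
    highest -- the 'hard' negatives it is most wrong about."""
    rng = np.random.default_rng(seed)
    n_cache = min(n_cache, len(X_neg_pool))
    neg = X_neg_pool[rng.choice(len(X_neg_pool), n_cache, replace=False)]
    clf = None
    for _ in range(n_rounds):
        X = np.vstack([X_pos, neg])
        y = np.r_[np.ones(len(X_pos)), -np.ones(len(neg))]
        clf = LinearSVC(penalty='l1', dual=False).fit(X, y)
        # Highest decision values among negatives = hardest negatives.
        hard = np.argsort(clf.decision_function(X_neg_pool))[-n_cache:]
        neg = X_neg_pool[hard]
    return clf
```

This keeps the training set small while still exposing the classifier to the negatives that matter most.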
  • 18. Results

                          Ours   State of the art
    Face                  0.66   0.58 [1]
    Object                0.78   0.75 [2]
    Attribute (ImageNet)  0.37   0.36 [3]
    Attribute (EBAY)      0.81   0.79 [4]
    Attribute (Bing)      0.82   -

    We beat all state-of-the-art methods except on scene recognition!! Moreover, our method is much cheaper than Li et al. [5] [1] Ozcan, M., Luo, J., Ferrari, V., Caputo, B.: A large-scale database of images and captions for automatic face naming. BMVC. (2011) [2] Fergus, R., Fei-Fei, L., Perona, P., Zisserman, A.: Learning object categories from google's image search. In: Computer Vision, 2005. ICCV 2005 [3] Russakovsky, O., Fei-Fei, L.: Attribute learning in large-scale datasets. In: Trends and Topics in Computer Vision. Springer (2012) [4] Van De Weijer, J., Schmid, C., Verbeek, J., Larlus, D.: Learning color names for real-world applications. Image Processing, IEEE (2009) [5] Li, Q., Wu, J., Tu, Z.: Harvesting mid-level visual concepts from large-scale internet images. CVPR (2013)
  • 19. Last Words ● Fact – We propose a novel algorithm, RSOM ● Fact – It roughly beats all state-of-the-art methods ● Fact – A solution for better datasets with little or no human effort ● Improvement – Estimate the number of clusters implicitly, without any hyperparameter ● Improvement – Use a more complex classification scheme.
  • 20. Not much... Thanks for your valuable time :)