Learning a Structured Model for Visual Category Recognition

Abstract:
This thesis deals with the problem of estimating structure in data due to the semantic relations between data elements and leveraging this information to learn a visual model for category recognition. A visual model consists of dictionary learning, which computes a succinct representation of training data by partitioning feature space, and feature encoding, which learns a representation of each image as a combination of dictionary elements. Besides variations in lighting and pose, a key challenge of classifying a category is intra-category appearance variation. The key idea in this thesis is that feature data describing a category has latent structure due to visual content idiomatic to a category. However, popular algorithms in the literature disregard this structure when computing a visual model.

Towards incorporating this structure in the learning algorithms, this thesis analyses two facets of feature data to discover relevant structure. The first is structure amongst the sub-spaces of the feature descriptor. Several sub-space embedding techniques that use global or local information to compute a projection function are analysed. A novel entropy-based measure of structure in the embedded descriptors suggests that relevant structure has local extent. The second is structure amongst the partitions of feature space. Hard partitioning of feature space leads to ambiguity in feature encoding. To address this issue, novel fuzzy-logic-based dictionary learning and feature encoding algorithms are employed that are able to model the local distributions of feature vectors and provide performance benefits.

To estimate structure amongst sub-spaces, co-clustering is used with a training descriptor data matrix to compute groups of sub-spaces. A dictionary learnt on feature vectors embedded in these multiple sub-manifolds is demonstrated to model data better than a dictionary learnt on feature vectors embedded in a single sub-manifold computed using principal components. In a similar manner, co-clustering is used with an encoded feature data matrix to compute groups of dictionary elements, referred to as 'topics'. A topic dictionary is demonstrated to perform better than a regular dictionary of comparable size. Both these results suggest that the groups of sub-spaces and dictionary elements have semantic relevance.

All the methods developed here have been viewed from the unifying perspective of matrix factorization, where a data matrix is decomposed into two factor matrices which are interpreted as a dictionary matrix and a coefficient matrix. Sparse coding methods, which are currently enjoying much success, can be viewed as matrix factorization with a regularization constraint on the vectors of the dictionary or coefficient matrices. ....

    Presentation Transcript

    • Learning A Structured Model For Visual Category Recognition. Ashish Gupta, University of Surrey, a.gupta@surrey.ac.uk. July 5, 2013. Sections: Introduction, Sub-space Embedding, Fuzzy Visual Model, Structure Estimation, Structured Sparse Model, Summary.
    • Introduction (outline). What is category recognition? Feature vector embedding: information in a sub-manifold. Feature vector distribution: fuzzy visual model. Estimating semantic structure: co-clustering. Sparse models: semantically structured. Summary and future work.
    • Motivation: Visual Category? A robot interacts with physical objects. Object taxonomy is based on physical properties. The robot recognizes an object using its visual appearance.
    • Motivation: Visual Category Model. Appearance variation → scatter of semantically related descriptors in feature space. Can this scatter distribution be estimated? Can this structure be used to improve the learnt visual model? Visual category model ≈ visual object model + estimated structure of visual category variation.
    • Approach: Visual Classification Pipeline. Structure in sub-spaces → groups of sub-spaces → dictionary. Structure in dictionary → groups of prototypes → encoding.
    • Approach: Feature Descriptor Matrix. [Figure: Scene-15 D-SIFT descriptor matrix; axes: feature vectors, dimensions.] Matrix of 500 D-SIFT feature descriptors, each of 128 dimensions.
    • Approach: Encoded Feature Matrix. Conceptual illustration of the encoded feature matrix: an occurrence histogram of visual words in images.
    • Approach: Conceptual Interpretation. Structure estimation can be interpreted as estimation of semantically related rows or columns of the data matrix. These are projected to a lower-dimensional space such that the mutual separation between equivalent feature vectors is reduced.
    • Sub-space Embedding. Feature descriptor space is high dimensional. Relevant information is embedded in a lower-dimensional sub-manifold. What is the appropriate lower dimensionality? How can the efficacy of a sub-space embedding method be measured? By measuring the information in the embedded feature vectors.
    • Intrinsic Dimensionality: estimation of the intrinsic dimensionality p. Correlation Dimension: the number of feature vectors in a hypersphere of radius r is proportional to r^p. Maximum Likelihood Estimate: expectation of the number of feature vectors covered by a hypersphere of growing radius r. Eigenvalue Estimate: number of eigenvalues greater than a small threshold value. Geodesic Minimum Spanning Tree: based on the length of the GMST of k descriptors in a neighbourhood graph.
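The correlation-dimension idea on this slide can be illustrated with a small sketch. It is not the estimator used in the thesis: the function names, the two radii and the toy data are assumptions chosen for illustration. If the fraction of point pairs closer than r grows like r^p, then the slope of log C(r) against log r between two radii approximates p.

```python
import math

def correlation_sum(points, radius):
    """Fraction of point pairs whose Euclidean distance is below `radius`."""
    n = len(points)
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < radius:
                close += 1
    return close / (n * (n - 1) / 2)

def correlation_dimension(points, r_small, r_large):
    """Slope of log C(r) against log r between two radii.

    If C(r) is proportional to r**p, this slope approximates p.
    """
    c_small = correlation_sum(points, r_small)
    c_large = correlation_sum(points, r_large)
    return (math.log(c_large) - math.log(c_small)) / (
        math.log(r_large) - math.log(r_small)
    )

# Toy data (hypothetical): 200 points on a straight line embedded in 3-D,
# so the estimated intrinsic dimensionality should come out close to 1.
line_points = [(t / 200, 2 * t / 200, -t / 200) for t in range(200)]
p_hat = correlation_dimension(line_points, 0.1, 0.2)
print(round(p_hat, 2))
```

The estimate depends on the chosen radii: both must be small relative to the extent of the data, yet large enough that many pairs fall inside them.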
    • Intrinsic Dimensionality: Estimated Intrinsic Dimensionality. [Figure]
    • Subspace Embedding Methods. Global methods: Principal Components, Multi-Dimensional Scaling, Stochastic Proximity Embedding, Isomap, Diffusion Maps. Local methods: Locally Linear Embedding, Locality Preserving Projection, Neighbourhood Preserving Projection, Landmark Isomap, t-Stochastic Neighbourhood Embedding.
    • Entropic Measure: Entropy Measure Intuition. [Figure: 3-D scatter plots (axes X, Y, Z) of the 'swiss' synthetic data, the 'intersect' synthetic data and the 'VOC2006, car' data; histogram of the distribution of pair-wise distances in the data (bin index vs. normalized frequency), with swiss H = −25.3355, intersect H = −19.3150 and VOC2006 car H = −33.0302.]
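As a rough illustration of an entropy computed on the distribution of pair-wise distances, the sketch below bins all pair-wise Euclidean distances and returns the Shannon entropy of the normalized histogram. This is an assumption about the general shape of the measure, not the thesis's definition: the H values quoted on the slide are negative, so the original measure must use a different normalization or sign convention. The function name and bin count are hypothetical.

```python
import math

def pairwise_distance_entropy(points, bins=100):
    """Shannon entropy (in nats) of the histogram of pair-wise distances."""
    distances = [
        math.dist(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    ]
    lo, hi = min(distances), max(distances)
    counts = [0] * bins
    for d in distances:
        if hi == lo:
            index = 0  # all distances identical: a single occupied bin
        else:
            index = min(int((d - lo) / (hi - lo) * bins), bins - 1)
        counts[index] += 1
    total = len(distances)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Three collinear points give distances 1, 2 and 3, one per bin when bins=3.
h_spread = pairwise_distance_entropy([(0.0,), (1.0,), (3.0,)], bins=3)
# Two points give a single distance, so the histogram has no spread at all.
h_single = pairwise_distance_entropy([(0.0,), (1.0,)], bins=3)
```

A histogram concentrated in a few bins gives a low value and a flat histogram gives a high one, which is the intuition the figure conveys.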
    • Empirical Results: Comparison of Embedded Entropy. [Figure]
    • Empirical Results: Computational Time Complexity. [Figure]
    • Empirical Results: Classification Performance. [Figure]
    • Empirical Results: Conclusion. The estimated intrinsic dimensionality of the 128-dimensional descriptor was in the neighbourhood of 14. The performance of LPP in comparison to other embedding methods accentuates the importance of modelling structure in local distributions.
    • Fuzzy Visual Model. Structure in the distribution of descriptors in feature space? Issues with K-means clustering in the Bag-of-Words model. Visual model incorporating a fuzzy logic framework.
    • Visual Ambiguity. Descriptor assignment has issues of uncertainty and plausibility. Kernel Codebook uses soft assignment to resolve the ambiguity.
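The difference between hard and soft assignment can be sketched as follows. This is a minimal illustration assuming a Gaussian kernel with a hand-picked bandwidth `sigma`; the function names, the toy dictionary and the bandwidth are hypothetical and not taken from the slides.

```python
import math

def hard_assignment(descriptor, dictionary):
    """One-hot vector marking the nearest visual word."""
    distances = [math.dist(descriptor, word) for word in dictionary]
    nearest = distances.index(min(distances))
    return [1.0 if k == nearest else 0.0 for k in range(len(dictionary))]

def soft_assignment(descriptor, dictionary, sigma=1.0):
    """Gaussian-kernel weights over all visual words, normalised to sum to 1."""
    kernel = [
        math.exp(-math.dist(descriptor, word) ** 2 / (2 * sigma ** 2))
        for word in dictionary
    ]
    total = sum(kernel)
    return [k / total for k in kernel]

dictionary = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
ambiguous = (1.0, 0.0)  # equidistant from the first two visual words

hard = hard_assignment(ambiguous, dictionary)
soft = soft_assignment(ambiguous, dictionary)
```

For the ambiguous descriptor, hard assignment arbitrarily picks the first of the two equally near words, whereas soft assignment splits the weight evenly between them and gives the distant third word almost none.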
    • Fuzzy Models: Visual Dictionary. [Figure: three partitions of the Motorcycle data (time vs. acceleration, both on a normalized scale): K-means hard partition, Fuzzy K-means partition, Gustafson-Kessel fuzzy partition.] K-means: $L(Z; \mu_C) = \sum_{j=1}^{r} \sum_{i \in C_j} \| z_i - \mu_{C_j} \|^2$. Fuzzy K-means: $L(Z; D, A) = \sum_{i=1}^{r} \sum_{j=1}^{n} (\alpha_{ij})^m \| z_j - \mu_{C_i} \|^2_{\Sigma}$. Gustafson-Kessel: $L(Z; D, A, \{\Sigma_i\}) = \sum_{i=1}^{r} \sum_{j=1}^{n} (\alpha_{ij})^m \| z_j - d_i \|^2_{\Sigma_i}$.
    • Fuzzy Models: distance measures. Fuzzy K-means: $d^2_{\Sigma}(z, \mu_C) = (z - \mu_C)^T \Sigma (z - \mu_C)$ with $\Sigma = \mathrm{diag}\big( (1/\sigma_1)^2, (1/\sigma_2)^2, \dots, (1/\sigma_n)^p \big)$. Gustafson-Kessel: $d^2_{\Sigma_i}(z_j, \mu_{C_i}) = (z_j - \mu_{C_i})^T \Sigma_i (z_j - \mu_{C_i})$, where $F_i = \frac{\sum_{j=1}^{n} (\alpha_{ij})^m (z_j - d_i)(z_j - d_i)^T}{\sum_{j=1}^{n} (\alpha_{ij})^m}$ and $\Sigma_i = (\rho_i \det(F_i))^{1/p} F_i$.
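A minimal sketch of how the fuzzy K-means loss can be minimized by alternating updates is given below. It assumes plain Euclidean distance (that is, the identity in place of Σ) and the standard fuzzy c-means membership rule; the slides state only the loss, so the update rules, the deterministic initialisation and the function name are assumptions. The Gustafson-Kessel variant, which additionally adapts a matrix Σ_i per cluster, is not covered.

```python
import math

def fuzzy_kmeans(data, r, m=2.0, iterations=50):
    """Alternate membership and centre updates for the fuzzy K-means loss.

    Returns (centres, memberships), where memberships[i][j] is the degree
    to which data[j] belongs to cluster i.
    """
    n, dim = len(data), len(data[0])
    # Assumption: deterministic initialisation from evenly spaced data points.
    centres = [data[(i * n) // r] for i in range(r)]
    memberships = [[0.0] * n for _ in range(r)]
    for _ in range(iterations):
        # Membership update: closer centres receive larger membership.
        for j, z in enumerate(data):
            dists = [math.dist(z, c) for c in centres]
            if min(dists) == 0.0:
                # z coincides with one or more centres: split membership among them.
                hits = [i for i, d in enumerate(dists) if d == 0.0]
                for i in range(r):
                    memberships[i][j] = 1.0 / len(hits) if i in hits else 0.0
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
                total = sum(inv)
                for i in range(r):
                    memberships[i][j] = inv[i] / total
        # Centre update: mean of the data weighted by membership ** m.
        for i in range(r):
            weights = [memberships[i][j] ** m for j in range(n)]
            total = sum(weights)
            centres[i] = tuple(
                sum(w * z[k] for w, z in zip(weights, data)) / total
                for k in range(dim)
            )
    return centres, memberships

# Toy data (hypothetical): two well-separated 1-D groups.
data = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
centres, memberships = fuzzy_kmeans(data, r=2)
```

Unlike K-means, every feature vector contributes to every centre, weighted by its membership raised to the fuzzifier m; as m approaches 1 the memberships harden towards the K-means partition.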
    • Empirical Results: FKM Classification Performance. [Figure: per-category accuracy (Acc) on Scene15 and on VOC2006, Bag-of-Words vs. Fuzzy K-means.]
    • Empirical Results: GK Classification Performance. [Figure: per-category accuracy (Acc) on Scene15 and on VOC2006, Bag-of-Words vs. Gustafson-Kessel.]
    • Empirical Results: Dictionary Size. [Figure: accuracy (Acc) on Caltech101 against dictionary size (32, 64, 128, 256, 512), Bag-of-Words vs. Fuzzy K-means and Bag-of-Words vs. Gustafson-Kessel.] Comparison of BoW with FKM and GK for different sizes of dictionary.
    • Empirical Results: Aggregate Performance. [Figure: accuracy (Acc) of Bag-of-Words, Fuzzy K-means and Gustafson-Kessel on (a) the VOC datasets (VOC2006, VOC2010) and (b) the Caltech datasets (Caltech101, Caltech256).]
      Visual Model   VOC-2006   VOC-2010   Caltech-101   Caltech-256
      BoW            0.50825    0.52446    0.60111       0.67606
      FKM            0.52635    0.53736    0.61928       0.68357
      G-K            0.52885    0.54224    0.62413       0.68623
    • Empirical Results: Conclusion. A visual model learnt within the framework of fuzzy logic adapts to the local distribution of feature vectors. Learning a better fuzzy membership function is an effective alternative to learning increasingly large dictionaries to adapt to the increasing complexity of visual categories.
    • Co-clustering for Structure Estimation. What is co-clustering? Co-clustering for structure in the descriptor data matrix. Co-clustering for structure in the encoded feature matrix.
    • Co-clustering Methods: Co-clustering. Co-clustering is the simultaneous, alternating clustering of the rows and columns of a data matrix. At each step of the optimization routine, the groups of rows guide column clustering and vice versa. $C_X : \{x_1, \dots, x_m\} \to \{\hat{x}_1, \dots, \hat{x}_k\}$, $C_Y : \{y_1, \dots, y_n\} \to \{\hat{y}_1, \dots, \hat{y}_l\}$.
    • Co-clustering Methods. Information-Theoretic Co-Clustering: the data matrix is considered a joint probability distribution; minimizes the KL-divergence between the original data and the co-clustered matrices. Sum-Squared Residue Co-Clustering: alternating k-means clustering of rows and columns; minimizes the squared Euclidean distance of rows and columns from the row and column means respectively.
    • Co-clustering Methods: Information-Theoretic Co-clustering. $I(X; Y) - I(\hat{X}; \hat{Y}) = d_{KL}(p(X, Y), q(X, Y))$
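The quantity on the left-hand side of the ITCC identity, the mutual information lost by replacing rows and columns with their clusters, can be evaluated directly for a given clustering. The sketch below only scores a fixed pair of row and column assignments; it does not perform the alternating optimization, and the function names and the toy joint distribution are assumptions for illustration.

```python
import math

def mutual_information(joint):
    """Mutual information (in nats) of a joint distribution given as rows."""
    row_marginal = [sum(row) for row in joint]
    col_marginal = [sum(col) for col in zip(*joint)]
    info = 0.0
    for x, row in enumerate(joint):
        for y, p in enumerate(row):
            if p > 0.0:
                info += p * math.log(p / (row_marginal[x] * col_marginal[y]))
    return info

def coclustered_joint(joint, row_labels, col_labels, n_row_clusters, n_col_clusters):
    """Aggregate p(x, y) into the distribution over (row cluster, column cluster)."""
    reduced = [[0.0] * n_col_clusters for _ in range(n_row_clusters)]
    for x, row in enumerate(joint):
        for y, p in enumerate(row):
            reduced[row_labels[x]][col_labels[y]] += p
    return reduced

def information_loss(joint, row_labels, col_labels, n_row_clusters, n_col_clusters):
    """I(X;Y) - I(X_hat;Y_hat) for a given row and column clustering."""
    reduced = coclustered_joint(
        joint, row_labels, col_labels, n_row_clusters, n_col_clusters
    )
    return mutual_information(joint) - mutual_information(reduced)

# Toy joint distribution (hypothetical) with two clean diagonal blocks.
joint = [
    [0.125, 0.125, 0.0, 0.0],
    [0.125, 0.125, 0.0, 0.0],
    [0.0, 0.0, 0.125, 0.125],
    [0.0, 0.0, 0.125, 0.125],
]
good = information_loss(joint, [0, 0, 1, 1], [0, 0, 1, 1], 2, 2)
bad = information_loss(joint, [0, 1, 0, 1], [0, 0, 1, 1], 2, 2)
```

A clustering that follows the block structure loses essentially no mutual information, while one that mixes rows from both blocks loses all of it, which is why minimizing this loss recovers the blocks.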
    • Multiple Sub-spaces: Intuition. $\sum_{i,j} d_E\big(z^{\bullet}_{i}|_{S_l}, z^{\bullet}_{j}|_{S_q}\big) > \sum_{i,j} d_E\big(z^{\bullet}_{i}, z^{\bullet}_{j}\big), \quad l = q$
    • Multiple Sub-spaces: Co-clustering descriptor data matrix. [Figure: Scene-15 D-SIFT matrix of 500 feature vectors of 128 dimensions, before and after Information-Theoretic Co-Clustering into 10 row and 10 column clusters; axes: feature vectors, dimensions.]
    • Multiple Sub-spaces: Dictionary on single and multiple sub-spaces. [Figure: universal PCA dictionary (VOC-2006, D-SIFT, 10 x 500, PCA + K-means) and universal CC dictionary (VOC-2006, D-SIFT, 10 x 500, SSRCC + K-means); axes: dictionary [500], dimensions [10].]
    • Multiple Sub-spaces: Classification performance. [Figure: F1 on VOC2006 and VOC2007 for Dict 10x1000 vs. MSSD(i) and MSSD(r) at 5x1000, and for Dict 10x1000 vs. MSSD(i) and MSSD(r) at 10x1000.] Comparison of classification performance of single and multiple sub-space dictionaries.
    • Multiple Sub-spaces: Dictionary projected to multiple sub-spaces. [Figure: universal dictionary (VOC-2006, D-SIFT, 128 x 500, K-means) and universal sub-manifold dictionary (VOC-2006, D-SIFT, 128 (10) x 500, SSRCC + K-means); axes: dictionary [500], dimensions [128], sub-manifolds [10].]
    • Multiple Sub-spaces: Classification performance. [Figure: F1(5) and F1(50) on VOC2006 and VOC2007 for Dict 128x1000 vs. SSSD(i) 128x1000 and SSSD(r) 128x1000.] Comparison of classification performance of a dictionary projected to multiple sub-spaces.
    • Topic Dictionary: Structure in Dictionary Intuition. Estimating groups of non-contiguous partitions of feature space that are semantically related.
    • Topic Dictionary: Topic Dictionary Concept. [Figure]
    • Topic Dictionary: Classification Performance. [Figure] Comparison of classification performance of dictionaries using BoW and ITCC, for the VOC2006 and Scene15 datasets.
    • Topic Dictionary: Dictionary sizes. [Figure: F1 on VOC2006, VOC2007, VOC2010, Scene15 and Caltech101 for BoW vs. CC(i) at dictionary sizes 100, 500 and 1000.] Comparative classification performance for different dictionary sizes.
    • Topic Dictionary: Conclusion. Groups of sub-spaces computed using co-clustering yielded dictionaries with better classification performance. Groups of feature space partitions (dictionary elements) yielded improved classification results. These estimated groups can be used in learning a semantically structured visual model.
    • Sparse Decomposition. [Figure]
    • Sparse Visual Model. A sparse model approximates a feature vector as a combination of a sub-set of an over-complete basis set. Sparsity is induced by adding a regularization constraint on the coefficients to the loss function. The degree of sparsity is determined empirically. Each basis element is considered individually. Possible structure amongst basis elements is disregarded.
    • Structured Sparse Model. SSPCA (structure in sub-spaces): co-clustered groups of sub-spaces are used to augment Sparse-PCA to compute a Structured Sparse-PCA dictionary. Group Lasso (structure in dictionary): co-clustered groups of dictionary elements are used to augment Lasso to compute a group Lasso feature encoding.
    • Sparse Regularization. Sparse regularization: $\min_{\alpha} \frac{1}{n} \sum_{i=1}^{n} L(z_i, D\alpha_i) + \lambda \Omega(\alpha)$. Lasso: $\min_{\alpha} \frac{1}{n} \sum_{i=1}^{n} \| z_i - D\alpha_i \|^2 + \lambda \| \alpha_i \|_1$. Group sparsity: $\min_{\alpha} \frac{1}{n} \sum_{i=1}^{n} \| z_i - D\alpha_i \|^2 + \lambda \sum_{j=1}^{k} \| \alpha_i^{G_j} \|$.
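The practical difference between the Lasso and group-sparsity penalties shows up in their proximal (shrinkage) operators, sketched below. The sketch assumes the group norm is the Euclidean norm, which the slide does not specify, and the coefficient vector and groups are hypothetical. It shows only the shrinkage step of a solver, not a full encoder.

```python
import math

def soft_threshold(alpha, lam):
    """Proximal operator of lam * ||alpha||_1: shrink each coefficient on its own."""
    return [math.copysign(max(abs(a) - lam, 0.0), a) for a in alpha]

def group_soft_threshold(alpha, groups, lam):
    """Proximal operator of lam * sum_j ||alpha_Gj||_2: shrink each group as a block."""
    result = [0.0] * len(alpha)
    for group in groups:
        norm = math.sqrt(sum(alpha[i] ** 2 for i in group))
        if norm > lam:
            scale = 1.0 - lam / norm
            for i in group:
                result[i] = scale * alpha[i]
    return result

alpha = [3.0, 0.5, 0.3, -0.4]
groups = [[0, 1], [2, 3]]  # hypothetical co-clustered groups of dictionary elements

lasso = soft_threshold(alpha, 1.0)
grouped = group_soft_threshold(alpha, groups, 1.0)
```

With the Lasso penalty the small coefficient 0.5 is zeroed individually. With the group penalty it survives because it shares a group with a large coefficient, while the second group is removed as a whole; this is the sense in which related dictionary elements are selected together.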
    • Structured Sub-space: Structured Sub-space Dictionary using ITCC. [Figure: per-category mAP on VOC2006 and on VOC2007, Sparse Subspace vs. Structured Subspace.]
    • Structured Sub-space: Structured Sub-space Dictionary using SSRCC. [Figure: per-category mAP on VOC2006 and on VOC2007, Sparse Subspace vs. Structured Subspace.]
    • Structured Sub-space: results. Sparse selection of a semantically related set of sub-spaces performs better than sparse individual selection of sub-spaces.
      Data Set   Sparse Subspace   Structured Sparse Subspace (ITCC)   Structured Sparse Subspace (SSRCC)
      VOC2006    67.5941           70.8295                             68.5808
      VOC2007    67.9971           68.0783                             68.3718
    • Structured Sparse Dictionary: Structured Sparse Encoding using ITCC. [Figure: per-category mAP on Scene15 and on VOC2006, Sparse Encoding vs. Structured Encoding.]
    • Structured Sparse Dictionary: Structured Sparse Encoding using SSRCC. [Figure: per-category mAP on Scene15 and on VOC2006, Sparse Encoding vs. Structured Encoding.]
    • Structured Sparse Dictionary: results. Sparse selection of a semantically related set of dictionary elements performs better than sparse individual selection of dictionary elements.
      Data Set   Sparse Encoding   Structured Sparse Encoding (ITCC)   Structured Sparse Encoding (SSRCC)
      VOC-2006   72.8386           73.3977                             72.7738
      Scene-15   68.5737           79.8794                             72.1155
    • Summary. Semantically relevant structure learnt in feature space was used to compute better visual models. Analysis of sub-space embedding emphasized modelling local distributions. Incorporating a fuzzy logic framework to learn dictionary kernels that adapt to local distributions yielded better visual models. Co-clustering was successful in grouping semantically related sub-spaces and feature space partitions. Estimated groups of sub-spaces and dictionary elements were used to compute structured sparse visual models, improving upon regular sparse models.
    • Future Work. Visual models using Fisher Kernel coding, which uses a Gaussian kernel, have been very successful. Combining the approach in Fisher Kernels with the learnt fuzzy membership functions could potentially improve the visual model. Fuzzy-logic-based learning algorithms that are more advanced than Gustafson-Kessel could be explored to learn better membership functions. Co-clustering creates a block factorization of the data matrix. Partial membership of rows and columns to the co-clusters would be the natural next step. Explore ways of using semantic structure to improve feature generation techniques, such as hierarchical models that aim to learn category-specific descriptors.
    • End. Questions...
    • Appendices: BoW Partitioning. [Figure: Bag-of-Words partition, VOC-2006, image #000017; axes x, y.] Figure: Bag-of-Words model and image '000017' in the VOC-2006 dataset. The dictionary of size 25 is computed using K-means clustering. The feature vectors are projected to 2 dimensions using PCA.
    • Appendices: FKM Partitioning. [Figure: Fuzzy K-means fuzzy partition, VOC-2006, image #000017; axes x, y.] Figure: Fuzzy K-means model and image '000017' in the VOC-2006 dataset.
    • Appendices: GK Partitioning. [Figure: Gustafson-Kessel fuzzy partition, VOC-2006, image #000017; axes x, y.] Figure: Gustafson-Kessel model and image '000017' in the VOC-2006 dataset.