Remote Sensing Scene Classification by Unsupervised Representation Learning
1. REMOTE SENSING SCENE
CLASSIFICATION BY UNSUPERVISED
REPRESENTATION LEARNING
By
Ansari Mohammed Atif Sohel
Under the guidance of
Dr. M. B. Kokare
2. CONTENTS
■ Motivation
■ Literature Survey
■ Proposed Method
■ Algorithm
■ SPM Feature Aggregation
■ Data Sets
■ Experimental Setup
■ Effectiveness of Weights
■ Effectiveness of SPM
■ Results
■ Conclusion
■ References
3. MOTIVATION
■ With the rapid development of the satellite sensor technology, high spatial
resolution remote sensing (HSR) data have attracted extensive attention
in military and civilian applications.
■ In order to make full use of these data, remote sensing scene classification
becomes an important and necessary precedent task.
■ For the sake of classifying these scenes, effectively recognizing and
describing the scenes of interest is a vital and challenging task.
4. LITERATURE SURVEY
To recognize and describe the scenes in HSR images, many methods have
been proposed in recent years.
■ A bottom-up scheme to model an HSR image scene
■ Bag-of-Visual Words (BoVW) Model
■ Probabilistic Latent Semantic Analysis (pLSA) and Latent Dirichlet
Allocation (LDA)
■ SIFT & HOG
10. SPM FEATURE AGGREGATION
■ SPM is used to aggregate features at different scales based on the
feature maps, because it retains spatial–structural information.
■ We divide the feature maps of each image into patches of size 3×3,
forming 3-D cubes of size 3×3×K, where K is the number of feature
maps.
■ These 3-D cubes are then normalized into vectors and clustered by
sparse coding, forming 50 cluster centers in the proposed method.
■ Then, for each image, the feature maps are encoded by these centers at
multiple scales.
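The steps above can be sketched roughly as follows. This is not the authors' implementation: it uses hard nearest-center assignment as a simple stand-in for sparse coding, assumes overlapping stride-1 patches and a 1×1/2×2/4×4 pyramid, and all function names are invented for illustration. The patch size 3×3, K feature maps, and 50 centers follow the slides.

```python
import numpy as np

def extract_patches(fmaps, p=3):
    # fmaps: (H, W, K) feature maps; slide a p x p window (stride-1 assumed)
    # and flatten each 3-D cube of size p x p x K into a normalized vector
    H, W, K = fmaps.shape
    patches = []
    for i in range(H - p + 1):
        for j in range(W - p + 1):
            v = fmaps[i:i + p, j:j + p, :].reshape(-1)
            n = np.linalg.norm(v)
            patches.append(v / n if n > 0 else v)
    return np.array(patches)

def encode(patches, centers):
    # hard nearest-center assignment: a stand-in for sparse coding
    d = ((patches[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return d.argmin(1)

def spm_pool(codes, grid_hw, n_centers, levels=(1, 2, 4)):
    # pool code histograms over a spatial pyramid (1x1, 2x2, 4x4 grids)
    gh, gw = grid_hw
    rows, cols = np.divmod(np.arange(codes.size), gw)
    feats = []
    for L in levels:
        for r in range(L):
            for c in range(L):
                sel = (rows * L // gh == r) & (cols * L // gw == c)
                feats.append(np.bincount(codes[sel], minlength=n_centers))
    return np.concatenate(feats)
```

With 50 centers and pyramid levels 1, 2, 4, the pooled descriptor has (1 + 4 + 16) × 50 = 1050 dimensions per image.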
11. DATA SETS
■ UCMerced Data Set:
This data set contains 21 classes of aerial scenes, each composed of
100 images. The images have a spatial resolution of 1 ft/pixel and a size
of 256×256 pixels.
12. DATA SETS
■ Sydney Data Set:
The large source image is 18000×14000 pixels with a spatial resolution of
0.5 m, as shown in the figure.
The original image is divided into 500×500-pixel subimages, each assumed
to contain a single scene.
From these, 613 images covering seven distinct scene categories are
extracted and labeled.
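The tiling step described above can be sketched as below; this only shows how a large scene is cut into fixed-size subimages, while the selection and labeling of the 613 usable images is a separate, manual step. The function name and the non-overlapping grid are assumptions.

```python
import numpy as np

def tile_image(img, tile=500):
    # split a large scene (H, W, ...) into non-overlapping tile x tile
    # subimages, dropping the ragged border where a full tile does not fit
    H, W = img.shape[:2]
    return [img[r:r + tile, c:c + tile]
            for r in range(0, H - tile + 1, tile)
            for c in range(0, W - tile + 1, tile)]
```

For an 18000×14000 image this yields 36 × 28 = 1008 candidate tiles, from which the labeled subset is chosen.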
14. EXPERIMENTAL SETUP
■ In our experiments, all images are converted to gray scale.
■ The number of feature maps K is set to 15, and the filter size is fixed
at 7×7.
■ Parameter β, the gradient step size for updating z, is set to 0.001.
■ Parameter λ balances the weights between the reconstruction term and
the regularization term.
■ For the SVM: BestC = 200, BestG = 2, and BestCV = 0.
■ Iteration count T = 10 and ISTA steps M = 5.
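The role of β and M above can be illustrated with a minimal ISTA inner loop. This is a generic sketch, not the paper's deconvolution update: it assumes the regularizer is an L1 penalty on z solved over a plain dictionary D, and the λ value used here is hypothetical.

```python
import numpy as np

def ista(D, x, lam=0.1, beta=0.001, steps=5):
    # minimize ||x - D z||^2 + lam * ||z||_1 with M ISTA steps:
    # a gradient step of size beta on the reconstruction term,
    # then soft-thresholding for the L1 regularizer weighted by lam
    z = np.zeros(D.shape[1])
    for _ in range(steps):
        grad = 2 * D.T @ (D @ z - x)   # gradient of the reconstruction term
        z = z - beta * grad
        z = np.sign(z) * np.maximum(np.abs(z) - beta * lam, 0.0)
    return z
```

Each of the T outer iterations would alternate such z updates with filter updates; β = 0.001 and M = 5 match the settings above.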
15. EFFECTIVENESS OF WEIGHTS
■ The feature map corresponding to the third filter in row 1 contains more
information than the one corresponding to the fourth filter in row 1.
■ Hence, the third filter in row 1 should be assigned a higher weight than
the fourth filter in row 1 to balance their contributions.
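The slides do not state the exact weighting rule, so the sketch below uses per-map activation energy as an assumed stand-in measure of how much information a feature map carries; the function name and normalization are illustrative only.

```python
import numpy as np

def filter_weights(fmaps, eps=1e-8):
    # assign each of the K feature maps a weight from its activation
    # energy, so filters whose maps carry more information contribute more
    energy = (fmaps ** 2).sum(axis=(0, 1))   # per-map energy, shape (K,)
    return energy / (energy.sum() + eps)     # weights sum to ~1
```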
19. CONCLUSION
■ HSR scenes cover a much wider ground surface, and their object
distributions are very complex.
■ The proposed unsupervised feature learning alleviates the classification
difficulties caused by interclass similarities and complex distributions
in HSR images.
■ The proposed method obtains a precise and discriminative representation
by combining the weighted deconvolution model with SPM.
■ According to the experimental results, the proposed method achieves
considerable performance compared with the state of the art,
demonstrating its effectiveness and time efficiency.
20. REFERENCES
■ S. Lazebnik, C. Schmid, and J. Ponce, “Beyond bags of features: Spatial
pyramid matching for recognizing natural scene categories,” in Proc.
IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., vol. 2. 2006,
pp. 2169–2178.
■ Y. Yang and S. Newsam, “Spatial pyramid co-occurrence for image
classification,” in Proc. IEEE Int. Conf. Comput. Vis., Nov. 2011, pp.
1465–1472.
■ M. Lienou, H. Maitre, and M. Datcu, “Semantic annotation of satellite
images using latent Dirichlet allocation,” IEEE Geosci. Remote Sens.
Lett., vol. 7, no. 1, pp. 28–32, Jan. 2010.
■ T. Hofmann, “Unsupervised learning by probabilistic latent semantic
analysis,” Mach. Learn., vol. 42, no. 1, pp. 177–196, Jan. 2001.
21. REFERENCES (CONTINUED)
■ M. D. Zeiler, G. W. Taylor, and R. Fergus, “Adaptive deconvolutional
networks for mid and high level feature learning,” in Proc. IEEE Int.
Conf. Comput. Vis., Nov. 2011, pp. 2018–2025.
■ M. D. Zeiler, D. Krishnan, G. W. Taylor, and R. Fergus, “Deconvolutional
networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun.
2010, pp. 2528–2535.
■ A. M. Cheriyadat, “Unsupervised feature learning for aerial scene
classification,” IEEE Trans. Geosci. Remote Sens., vol. 52, no. 1, pp.
439–451, Jan. 2014.
■ F. Hu, G.-S. Xia, Z. Wang, X. Huang, L. Zhang, and H. Sun,
“Unsupervised feature learning via spectral clustering of
multidimensional patches for remotely sensed scene classification,” IEEE
J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 8, no. 5, pp. 2015–
2030, May 2015.