Remote Sensing Scene Classification by Unsupervised Representation Learning
1. REMOTE SENSING SCENE
CLASSIFICATION BY UNSUPERVISED
REPRESENTATION LEARNING
By
Ansari Mohammed Atif Sohel
Under the guidance of
Dr. M. B. Kokare
2. CONTENTS
■ Motivation
■ Literature Survey
■ Proposed Method
■ Algorithm
■ SPM Feature Aggregation
■ Data Sets
■ Experimental Setup
■ Effectiveness of Weights
■ Effectiveness of SPM
■ Results
■ Conclusion
■ References
3. MOTIVATION
■ With the rapid development of the satellite sensor technology, high spatial
resolution remote sensing (HSR) data have attracted extensive attention
in military and civilian applications.
■ In order to make full use of these data, remote sensing scene classification
becomes an important and necessary precedent task.
■ For the sake of classifying these scenes, effectively recognizing and
describing the scenes of interest is a vital and challenging task.
4. LITERATURE SURVEY
To recognize and describe the scenes in HSR images, many methods have
been proposed in recent years.
■ A bottom-up scheme to model an HSR image scene
■ Bag-of-Visual Words (BoVW) Model
■ Probabilistic Latent Semantic Analysis (pLSA) and Latent Dirichlet
Allocation (LDA)
■ SIFT & HOG
10. SPM FEATURE AGGREGATION
■ SPM is used to aggregate features at different scales based on the
feature maps, because it retains spatial–structural information.
■ We divide the feature maps of each image into patches of size 3×3,
forming 3-D cubes of size 3×3×K, where K is the number of feature
maps.
■ These 3-D cubes are then normalized into vectors and clustered by
sparse coding, forming 50 cluster centers in the proposed method.
■ Then, for each image, the feature maps are encoded by these centers at
multiple scales.
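The steps above can be sketched roughly as follows. This is not the authors' implementation: it uses hard nearest-center assignment as a simple stand-in for sparse coding, assumes overlapping stride-1 patches and a 1×1/2×2/4×4 pyramid, and all function names are invented for illustration. The patch size 3×3, K feature maps, and 50 centers follow the slides.

```python
import numpy as np

def extract_patches(fmaps, p=3):
    # fmaps: (H, W, K) feature maps; slide a p x p window (stride-1 assumed)
    # and flatten each 3-D cube of size p x p x K into a normalized vector
    H, W, K = fmaps.shape
    patches = []
    for i in range(H - p + 1):
        for j in range(W - p + 1):
            v = fmaps[i:i + p, j:j + p, :].reshape(-1)
            n = np.linalg.norm(v)
            patches.append(v / n if n > 0 else v)
    return np.array(patches)

def encode(patches, centers):
    # hard nearest-center assignment: a stand-in for sparse coding
    d = ((patches[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return d.argmin(1)

def spm_pool(codes, grid_hw, n_centers, levels=(1, 2, 4)):
    # pool code histograms over a spatial pyramid (1x1, 2x2, 4x4 grids)
    gh, gw = grid_hw
    rows, cols = np.divmod(np.arange(codes.size), gw)
    feats = []
    for L in levels:
        for r in range(L):
            for c in range(L):
                sel = (rows * L // gh == r) & (cols * L // gw == c)
                feats.append(np.bincount(codes[sel], minlength=n_centers))
    return np.concatenate(feats)
```

With 50 centers and pyramid levels 1, 2, 4, the pooled descriptor has (1 + 4 + 16) × 50 = 1050 dimensions per image.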
11. DATA SETS
■ UCMerced Data Set:
This data set contains 21 classes of aerial scenes, each composed of
100 images. The images have a spatial resolution of 1 ft/pixel and a size
of 256×256 pixels.
12. DATA SETS
■ Sydney Data Set:
The large source image is 18000×14000 pixels with a spatial resolution of
0.5 m, as shown in the figure.
The original image is divided into 500×500-pixel subimages, each assumed
to contain a single scene.
From these, 613 images covering seven distinct scene categories are
extracted and labeled.
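The tiling step described above can be sketched as below; this only shows how a large scene is cut into fixed-size subimages, while the selection and labeling of the 613 usable images is a separate, manual step. The function name and the non-overlapping grid are assumptions.

```python
import numpy as np

def tile_image(img, tile=500):
    # split a large scene (H, W, ...) into non-overlapping tile x tile
    # subimages, dropping the ragged border where a full tile does not fit
    H, W = img.shape[:2]
    return [img[r:r + tile, c:c + tile]
            for r in range(0, H - tile + 1, tile)
            for c in range(0, W - tile + 1, tile)]
```

For an 18000×14000 image this yields 36 × 28 = 1008 candidate tiles, from which the labeled subset is chosen.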
14. EXPERIMENTAL SETUP
■ In our experiments, all images are converted to gray scale.
■ The number of feature maps K is set to 15, and the filter size is fixed
at 7×7.
■ Parameter β, the gradient step size for updating z, is set to 0.001.
■ Parameter λ balances the weights between the reconstruction term and
the regularization term.
■ For the SVM: BestC = 200, BestG = 2, and BestCV = 0.
■ Iteration count T = 10 and ISTA steps M = 5.
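The role of β and M above can be illustrated with a minimal ISTA inner loop. This is a generic sketch, not the paper's deconvolution update: it assumes the regularizer is an L1 penalty on z solved over a plain dictionary D, and the λ value used here is hypothetical.

```python
import numpy as np

def ista(D, x, lam=0.1, beta=0.001, steps=5):
    # minimize ||x - D z||^2 + lam * ||z||_1 with M ISTA steps:
    # a gradient step of size beta on the reconstruction term,
    # then soft-thresholding for the L1 regularizer weighted by lam
    z = np.zeros(D.shape[1])
    for _ in range(steps):
        grad = 2 * D.T @ (D @ z - x)   # gradient of the reconstruction term
        z = z - beta * grad
        z = np.sign(z) * np.maximum(np.abs(z) - beta * lam, 0.0)
    return z
```

Each of the T outer iterations would alternate such z updates with filter updates; β = 0.001 and M = 5 match the settings above.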
15. EFFECTIVENESS OF WEIGHTS
■ The feature map corresponding to the third filter in row 1 contains more
information than the one corresponding to the fourth filter in row 1.
■ Hence, the third filter in row 1 should be assigned a higher weight than
the fourth filter in row 1 to balance their contributions.
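The slides do not state the exact weighting rule, so the sketch below uses per-map activation energy as an assumed stand-in measure of how much information a feature map carries; the function name and normalization are illustrative only.

```python
import numpy as np

def filter_weights(fmaps, eps=1e-8):
    # assign each of the K feature maps a weight from its activation
    # energy, so filters whose maps carry more information contribute more
    energy = (fmaps ** 2).sum(axis=(0, 1))   # per-map energy, shape (K,)
    return energy / (energy.sum() + eps)     # weights sum to ~1
```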
19. CONCLUSION
■ HSR scenes cover a much wider ground surface, and their object
distributions are very complex.
■ The proposed unsupervised feature learning alleviates the classification
difficulties caused by interclass similarities and complex distributions
in HSR images.
■ The proposed method obtains a precise and discriminative representation
by combining the weighted deconvolution model with SPM.
■ According to the experimental results, the proposed method achieves
considerable performance compared with the state of the art,
demonstrating its effectiveness and time efficiency.
20. REFERENCES
■ S. Lazebnik, C. Schmid, and J. Ponce, “Beyond bags of features: Spatial
pyramid matching for recognizing natural scene categories,” in Proc.
IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., vol. 2. 2006,
pp. 2169–2178.
■ Y. Yang and S. Newsam, “Spatial pyramid co-occurrence for image
classification,” in Proc. IEEE Int. Conf. Comput. Vis., Nov. 2011, pp.
1465–1472.
■ M. Lienou, H. Maitre, and M. Datcu, “Semantic annotation of satellite
images using latent Dirichlet allocation,” IEEE Geosci. Remote Sens.
Lett., vol. 7, no. 1, pp. 28–32, Jan. 2010.
■ T. Hofmann, “Unsupervised learning by probabilistic latent semantic
analysis,” Mach. Learn., vol. 42, no. 1, pp. 177–196, Jan. 2001.
21. REFERENCES (CONTINUED)
■ M. D. Zeiler, G. W. Taylor, and R. Fergus, “Adaptive deconvolutional
networks for mid and high level feature learning,” in Proc. IEEE Int.
Conf. Comput. Vis., Nov. 2011, pp. 2018–2025.
■ M. D. Zeiler, D. Krishnan, G. W. Taylor, and R. Fergus, “Deconvolutional
networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun.
2010, pp. 2528–2535.
■ A. M. Cheriyadat, “Unsupervised feature learning for aerial scene
classification,” IEEE Trans. Geosci. Remote Sens., vol. 52, no. 1, pp.
439–451, Jan. 2014.
■ F. Hu, G.-S. Xia, Z. Wang, X. Huang, L. Zhang, and H. Sun,
“Unsupervised feature learning via spectral clustering of
multidimensional patches for remotely sensed scene classification,” IEEE
J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 8, no. 5, pp. 2015–
2030, May 2015.