This document presents a method for detecting forest change with deep neural networks on incomplete satellite images. It proposes a three-stage process that recovers missing and cloudy data by exploiting temporal redundancy. A deep CNN then classifies candidate image regions as changed or unchanged using spatial and temporal contextual information from multiple time periods. The method can precisely localize and timestamp detected changes to within a few months. Evaluation on Australian satellite imagery shows that the approach effectively detects fires and harvested areas.
Forest Change Detection Using Deep Neural Networks
1. FOREST CHANGE DETECTION IN INCOMPLETE SATELLITE IMAGES WITH DEEP NEURAL NETWORKS
By
Ansari Mohammed Atif Sohel
Under the guidance of
Dr. M. B. Kokare
2. CONTENTS
■ Motivation
■ Literature Survey
■ Proposed Method
■ Study Area
■ Data Recovery
■ Change Detection
■ Deep Neural Network
■ Experimental Settings
■ Results
■ Conclusion
■ Future Scope
3. MOTIVATION
■ Land cover change monitoring is an important task from the perspective
of regional resource monitoring, disaster management, land development,
and environmental planning.
■ Forest change detection is crucial for continuous environmental
monitoring to closely investigate natural resource depletion, biodiversity
loss, and deforestation.
■ Between 2006–2007 and 2010–2011, an area of approximately 39 million
hectares was destroyed by fires in Australia, and about 9000 hectares were
harvested each year.
4. LITERATURE SURVEY
The prevalent approaches for change detection in remotely sensed data can
be categorized into two major classes:
■ Low-Level Local Approaches: The low-level approaches use statistical
indices derived from the pixel values of spectral images. They are limited
to pixel level analysis, and thus, they remain agnostic to the valuable
contextual information.
■ Object-based Approaches: The object-based approaches consider the
contextual information by working on the homogeneous pixels, which are
usually grouped together based on their appearance (spectral
information), location, and/or temporal properties.
6. STUDY AREA
We analysed a 222.4 × 90 km² rectangular area to the northeast of Melbourne, VIC, Australia.
The remote sensing satellite data are provided by the Australian Reflectance Grid (ARG)
from Geoscience Australia. ARG is a medium-resolution (0.00025° ≈ 25 m) grid of
surface reflectance data based on the U.S. Geological Survey's Landsat TM/ETM+ imagery.
7. DATA RECOVERY
■ The data under investigation contain several artifacts, because of which
the land cover is not always visible in the ARG.
■ These artifacts include missing surface reflectance data, heavy clouds,
and saturated channels in remotely sensed data.
■ Moreover, black stripes (wedge-shaped gaps) appear in the Landsat-7
ETM+ imagery due to the failure of the scan line corrector (SLC) in
2003.
9. DATA RECOVERY STAGES
To fill in the missing data and the residual cloudy regions, we design a
three-stage image completion process that exploits the redundancy in the raw
image data.
■ Gap Filling
It deals with large gaps by assessing the reliability of data
along the temporal domain.
■ Masked Sparse Reconstruction
It performs a spatial refinement to remove noisy data and
ensure spatial consistency.
■ Thin Cloud Removal
It performs further refinement by removing very thin and
transparent clouds.
10. GAP FILLING
In the first stage, we fuse the reliable data along the temporal dimension to
generate one representative image for a period of approximately two months
using the corresponding flags in the available pixel quality map.
Then, we construct a mean image from the representative images to obtain a
yearly background profile, which we employ consecutively to fill the
remaining missing pixels in the original images.
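A minimal NumPy sketch of these two steps, assuming per-pixel boolean quality flags mark reliable observations; the function names and array layout are illustrative, not the paper's exact implementation:

```python
import numpy as np

def fuse_period(frames, quality):
    """Fuse all frames of a ~2-month period into one representative image:
    each pixel is the mean of its reliable observations, NaN if none exist."""
    frames = np.asarray(frames, dtype=float)    # (T, H, W) reflectance frames
    quality = np.asarray(quality, dtype=bool)   # (T, H, W) True = reliable pixel
    counts = quality.sum(axis=0)
    summed = np.where(quality, frames, 0.0).sum(axis=0)
    out = np.full(counts.shape, np.nan)
    np.divide(summed, counts, out=out, where=counts > 0)
    return out

def fill_with_yearly_mean(period_images):
    """Fill remaining gaps (NaN) with the yearly background profile, i.e. the
    mean of the representative images over the year."""
    stack = np.asarray(period_images, dtype=float)   # (P, H, W), NaN = missing
    background = np.nanmean(stack, axis=0)           # yearly mean, ignoring gaps
    return [np.where(np.isnan(img), background, img) for img in stack]
```

Pixels that are unreliable in every frame of a period stay NaN after the first step, which is exactly what the yearly background profile then fills.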
11. MASKED SPARSE RECONSTRUCTION
In the second stage, we further enhance the image frames using masked
sparse reconstruction to enforce the spatial consistency and remove possible
artifacts generated from the first stage.
Given a set of input images {Iₙ}ₙ₌₁ᴺ, we first extract overlapping patches of
size s×s with a uniform step p. These patches form a set P = {pᵢ}ᵢ₌₁ᴹ, where
M is normally a considerably large number.
To make the dictionary learning step computationally feasible, we randomly
choose a relatively smaller set of patches, denoted by P̃ = {pᵢ}ᵢ₌₁ᵐ.
Typically, the learned dictionary is composed of r basis vectors, where r ≪ m.
12. Continue..
The objective minimized during the dictionary learning process is defined as
follows:

min_{D∈∁} (1/m) ∑ᵢ₌₁ᵐ min_{αᵢ∈ℝʳ} ( ½‖pᵢ − Dαᵢ‖₂² + λ‖αᵢ‖₁ + γ‖αᵢ‖₂² ) ……(1)

where λ and γ are the regularization parameters which enforce a sparse
solution for αᵢ. The set ∁ is the constraint set of matrices defined as follows:

∁ = { D ∈ ℝ^(q×r) s.t. ‖dⱼ‖₂² ≤ 1, j ∈ [1, r] } ……(2)

The objective function for this recovery step can be formulated as follows:

min_{αᵢ∈ℝʳ} ½‖Mᵢ(xᵢ − Dαᵢ)‖₂² + λ′‖αᵢ‖₀, ∀i ∈ [1, M] ……(3)
We form an overcomplete dictionary by setting a small patch size s, and
therefore, r > s². In our experiments, the following parameter settings were
used: s = 8, p = 2, m = 5×10⁵, and r = 512. The total number of patches M
was ∼2×10⁹ and ∼3.8×10⁹ for Db-37 and Db-36, respectively.
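The ℓ₀-penalized masked recovery in (3) is commonly solved greedily; the sketch below uses orthogonal matching pursuit restricted to the observed pixels, which is one standard choice rather than necessarily the paper's exact solver, and the function name is hypothetical:

```python
import numpy as np

def masked_omp(x, mask, D, sparsity):
    """Reconstruct a patch x from its observed entries (mask == True) by a
    greedy OMP approximation of Eq. (3); masked pixels are filled by D @ alpha."""
    Dm, xm = D[mask], x[mask]           # restrict to observed pixels (applies M_i)
    residual = xm.copy()
    support, alpha = [], np.zeros(D.shape[1])
    coef = np.zeros(0)
    for _ in range(sparsity):
        j = int(np.argmax(np.abs(Dm.T @ residual)))   # most correlated atom
        if j in support:
            break
        support.append(j)
        # least-squares fit of the observed entries on the current support
        coef, *_ = np.linalg.lstsq(Dm[:, support], xm, rcond=None)
        residual = xm - Dm[:, support] @ coef
    alpha[support] = coef
    return D @ alpha                    # full s*s patch, masked pixels included
```

Because the least-squares fit only sees observed pixels, the returned patch simultaneously denoises the visible data and inpaints the gaps from the learned dictionary.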
13. Continue..
(Left) Gap filling output. (Right) Masked sparse reconstruction step reduces
noise and removes boundary effects caused by the gap filling.
16. CHANGE DETECTION
We formulate the change detection task as a region classification problem.
■ Identify change area proposals
■ Apply a deep CNN to classify each proposal as change or no change.
More specifically, we consider the healthy forest cover under normal
conditions as a no-change region.
Change Detection Stages:
■ Multiscale Region Proposal Generation
■ Candidate Suppression
■ Deep Convolutional Neural Networks
17. MULTISCALE REGION PROPOSAL GENERATION
Box proposals are generated at multiple scales to capture all sizes of change
events. The constants “H” and “W” denote the height and the width of the
original image, respectively.
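A sketch of multiscale sliding-window proposal generation over an H×W image; the specific scales and overlap ratio are illustrative defaults, not the paper's exact grid:

```python
def multiscale_proposals(H, W, scales=(16, 32, 64), overlap=0.5):
    """Generate square box proposals (x1, y1, x2, y2) at several scales,
    sliding each window with a stride derived from the overlap ratio."""
    boxes = []
    for s in scales:
        step = max(1, int(s * (1 - overlap)))           # stride between windows
        for y in range(0, max(1, H - s + 1), step):
            for x in range(0, max(1, W - s + 1), step):
                boxes.append((x, y, min(x + s, W), min(y + s, H)))
    return boxes
```

Small scales catch localized events such as spot fires, while large scales cover broad harvest areas in a single proposal.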
18. DEEP CONVOLUTIONAL NEURAL NETWORKS
The network takes a series of patches [P(i −t)...P(i +t)] centered at a given
time instance for each change area proposal. The feature representations are
fused together after the first FC layer using a max-pooling operation to
produce temporally consistent and smooth features.
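The fusion step can be illustrated with a toy NumPy model: each patch in the series passes through the same (shared-weight) first FC layer, and the resulting feature vectors are max-pooled element-wise over time. The weights and dimensions here are random placeholders, not the trained network:

```python
import numpy as np

rng = np.random.default_rng(42)
patch_dim, feat_dim = 16, 8
W_fc = rng.standard_normal((patch_dim, feat_dim))   # toy shared first-FC weights

def encode(patch_vec):
    """Shared-weight first FC layer with ReLU, applied to one flattened patch."""
    return np.maximum(patch_vec @ W_fc, 0.0)

def fuse_over_time(patch_series):
    """Element-wise max over per-time-step features of [P(i-t) ... P(i+t)]."""
    feats = np.stack([encode(p) for p in patch_series])   # (2t+1, feat_dim)
    return feats.max(axis=0)
```

Taking the max (rather than, say, concatenating) keeps the fused feature dimension fixed regardless of the temporal window length and makes the representation robust to frames where the change is briefly occluded.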
19. EXPERIMENTAL SETTINGS
■ We use a combination of Bands 5, 4, and 1 from the Landsat 7 imagery
and Bands 6, 5, and 2 from the Landsat 8 imagery for training and testing.
■ These band combinations for Landsats 7 and 8 are suitable for natural-
looking visualization of vegetation and fires.
■ To enhance the contrast of the image, we perform a uniform rescaling of
the red, green, and blue channels within the ranges of 0.0055–0.0463,
0.0132–0.0600, and 0.0029–0.0175, respectively.
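Using the per-channel ranges given above, the rescaling amounts to a clip-and-normalize per channel; the linear map to [0, 1] is an assumption about the exact form, and the dictionary name is illustrative:

```python
import numpy as np

# Per-channel surface-reflectance ranges from the slide
CHANNEL_RANGES = {
    "red":   (0.0055, 0.0463),
    "green": (0.0132, 0.0600),
    "blue":  (0.0029, 0.0175),
}

def rescale(channel, lo, hi):
    """Linearly map reflectance in [lo, hi] to [0, 1], clipping outliers."""
    return np.clip((channel - lo) / (hi - lo), 0.0, 1.0)
```

Applying this per channel stretches the narrow reflectance range of vegetation into the full display range, which is what makes burn scars and clearings visually distinct.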
20. RESULTS
The ground-truth and the predicted change/no-change labels are shown on
the top left corner in blue and red colors, respectively. Digits 1–3 on the top
left represent no change, fire, and harvest, respectively.
23. CONCLUSION
■ Our proposed approach is capable of performing change analysis at a
much finer temporal resolution and automatically learns strong features
from the raw surface reflectance data.
■ To achieve a finer temporal resolution, we perform data inpainting using
the reliable data values and sparse coding.
■ For change detection, our approach works on the object level by
identifying a candidate set of change regions using multiresolution area
profiles.
■ We use both the spatial and the temporal contextual information in the
deep CNN model, which helps in making better predictions.
■ Our method can precisely localize the change regions and predict their
timing accurately within an error margin of three to six months.
24. FUTURE SCOPE
■ In the future, the possibility of creating a large-scale annotated data set
will be investigated.
■ This will enable the training of large-scale data-driven models from
scratch.
■ Since interesting changes are scarce in practical settings, we will also
investigate class-imbalanced learning of deep networks for change
detection.
25. REFERENCES
■ M. Hussain, D. Chen, A. Cheng, H. Wei, and D. Stanley, “Change
detection from remotely sensed images: From pixel-based to object-based
approaches,” ISPRS J. Photogramm. Remote Sens., vol. 80, pp. 91–106,
Jun. 2013.
■ G. Chen, G. J. Hay, L. M. T. Carvalho, and M. A. Wulder, “Object-based
change detection,” Int. J. Remote Sens., vol. 33, no. 14, pp. 4434–4457,
2012.
■ J.-F. Mas, “Monitoring land-cover changes: A comparison of change
detection techniques,” Int. J. Remote Sens., vol. 20, no. 1, pp. 139–152,
1999.
■ Z. Zhu and C. E. Woodcock, “Object-based cloud and cloud shadow
detection in Landsat imagery,” Remote Sens. Environ., vol. 118, pp. 83–
94, Mar. 2012.
26. Continue..
■ O. A. B. Penatti, K. Nogueira, and J. A. dos Santos, “Do deep features
generalize from everyday objects to remote sensing and aerial scenes
domains?” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit.
Workshops, Jun. 2015, pp. 44–51.
■ L. Gueguen and R. Hamid, “Large-scale damage detection using satellite
imagery,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2015,
pp. 1321–1328.
■ J. Wang, J. Song, M. Chen, and Z. Yang, “Road network extraction: A
neural-dynamic framework based on deep learning and a finite state
machine,” Int. J. Remote Sens., vol. 36, no. 12, pp. 3144–3169, 2015.
■ S. H. Khan, M. Hayat, M. Bennamoun, F. Sohel, and R. Togneri. (2015).
“Cost sensitive learning of deep feature representations from imbalanced
data.” [Online]. Available: https://arxiv.org/abs/1508.03422