
ISPRS Journal of Photogrammetry & Remote Sensing 58 (2004) 225 – 238
www.elsevier.com/locate/isprsjprs

Object-based classification of remote sensing data
for change detection
Volker Walter *
Institute for Photogrammetry, University of Stuttgart, Geschwister-Scholl-Str. 24 D, Stuttgart D-70174, Germany

Received 31 January 2003; accepted 26 September 2003

Abstract

In this paper, a change detection approach based on an object-based classification of remote sensing data is introduced. The
approach classifies not single pixels but groups of pixels that represent already existing objects in a GIS database. The approach
is based on a supervised maximum likelihood classification. The multispectral bands grouped by objects and very different
measures that can be derived from multispectral bands represent the n-dimensional feature space for the classification. The
training areas are derived automatically from the geographical information system (GIS) database.
After an introduction to the general approach, different input channels for the classification are defined and discussed. The
results of a test on two test areas are presented. Afterwards, further measures, which can improve the result of the classification
and enable the distinction between more land-use classes than with the introduced approach, are presented.
© 2003 Elsevier B.V. All rights reserved.

Keywords: change detection; classification; object-oriented image analysis; data fusion

* Tel.: +49-711-121-4091; fax: +49-711-121-3297.
E-mail address: Volker.Walter@ifp.uni-stuttgart.de (V. Walter).

0924-2716/$ - see front matter © 2003 Elsevier B.V. All rights reserved.
doi:10.1016/j.isprsjprs.2003.09.007

1. Introduction

In Walter and Fritsch (2000), a concept for the automatic revision of geographical information system
(GIS) databases using multispectral remote sensing data was introduced. This approach can be
subdivided into two steps (see Fig. 1). In a first step, remote sensing data are classified with a
supervised maximum likelihood classification into different land-use classes. The training areas are
derived from an already existing GIS database in order to avoid the time-consuming task of manual
acquisition. This can be done if it is assumed that the number of changes in the real world is very
small compared with the number of all GIS objects in the database. This assumption is justified
because we want to realise update cycles in the range of several months.

Fig. 1. Pixel-based classification approach.

In a second step, the classified remote sensing data have to be matched with the existing GIS objects
in order to find those objects where a change occurred, or which were collected wrongly. We solved this
task by measuring per object the percentage, homogeneity, and form of the pixels that are classified to
the same object class as the respective object stored in the database (Walter, 2000). All objects are
classified into the classes fully verified, partly verified, and not found by using thresholds that can
be defined interactively by the user.

The problem of using thresholds is that they are data-dependent. For example, the percentage of
vegetation pixels varies significantly between data that are captured in summer or in winter. Other
influencing factors are light and weather conditions, soil type, or time of day. Therefore, we cannot
use the same thresholds for different datasets. In order to avoid the problem of defining
data-dependent thresholds, we introduce an object-based supervised classification approach. The
object-based classification works in the same way as a pixel-based classification (see Fig. 2), with
the difference that we do not classify each pixel but combine all pixels of each object and classify
them together. Again, the training areas for the classification of the objects are derived from the
existing database in order to avoid a time-consuming manual acquisition.

In a “normal” classification, the greyscale values of each pixel in different multispectral channels
and possibly some other preprocessed texture channels are used as input. For the classification of
groups of pixels, we have to define new measures that can be very simple (e.g., the mean grey value of
all pixels of an object in a specific channel) but also very complex, like measures that describe the
form of an object. This approach is very flexible because it can combine very different measures for
describing an object. We can even use the result of a pixel-based classification and count for each
object the percentage of pixels that are classified to a specific land-use class.

Because the result of the approach is a classification into the most likely class, the problematic
part of matching is now replaced by a single comparison of the classification result with the GIS
database without using any thresholds.

Fig. 2. Differences between object-based and pixel-based classification.

1.1. Related work

This kind of approach is an object-oriented image analysis that is also successfully applied to other
problems. A good overview of different approaches can be found in Blaschke et al. (2000). These
approaches can be subdivided into approaches that use existing GIS data to superimpose it on an image
(per-field or per-parcel classification), and approaches that use object-oriented classification rules
without any GIS input. Approaches that use existing GIS data are not very widely used today. In Aplin
et al. (1999), an example for a per-field classification approach is introduced, which first classifies
the image into different land-use classes. Afterwards, the fields (which represent forest parcels from
a GIS database) are subdivided into different classes, depending on the classification result, by
using thresholds. The main difference between existing approaches and ours is that our approach uses
no thresholds.

2. Object-based classification

2.1. Input data

The following tests were carried out with ATKIS datasets. ATKIS is the German national topographic and
cartographic database, and captures the landscape in the scale of 1:25,000 (AdV, 1988). In Walter
(1999), it was shown that a spatial resolution of at least 2 m is needed to update data in the scale of
1:25,000. The remote sensing data were captured with
the DPA system, which is an optical airborne digital camera (Hahn et al., 1996). The original
resolution of 0.5 m was resampled to a resolution of 2 m. The DPA system has four multispectral
channels [blue 440–525 nm, green 520–600 nm, red 610–685 nm, near-infrared (NIR) 770–890 nm].

2.2. Classification classes

Currently, 63 different object classes are collected in ATKIS. There are a lot of object classes that
can have very similar appearances in an image of 2 m pixel size (e.g., industrial areas, residential
areas, or areas of mixed use). Therefore, we do not use 63 land-use classes for the classification but
subdivide all object classes into the five land-use classes: water, forest, settlement, greenland, and
roads. The land-use class roads is only used in the first step in the process for the pixel-based
classification. Because of the linear shape, roads consist of many mixed pixels in a resolution of 2 m
and have to be checked with other techniques (see Walter, 1998).

Fig. 3. Input data for (a) object-based and (b) pixel-based classification.

2.3. Input channels

Like in a pixel-based classification, we can use all spectral bands as input channels. The difference
is that in the pixel-based classification, each pixel is classified separately, whereas in the
object-based classification, all pixels that belong to one GIS object are grouped together. In order
to analyse the spectral behaviour of objects, we calculate the mean grey value of each channel for all
GIS objects. Fig. 3 shows as an example the original input data (b) and
the mean RGB (red green blue) value (a) of each GIS object. The result of the pixel grouping is like a
smoothing of the data. The spectral behaviour of the objects is similar to the typical spectral
behaviour of the pixels. For example, forest areas are represented in the green channel by dark
pixels/objects, whereas settlements are represented by bright pixels/objects. This behaviour can also
be seen in Fig. 4. The scatterplots show the distribution of (a) the grey values of settlement and
forest pixels compared with the distribution of (b) the mean grey value of settlement and forest
objects in the channels red and NIR. It can be seen that the behaviour is similar but the separation of
the two classes becomes blurred because of the smoothing effect. In the object-based classification,
all multispectral bands of the DPA camera system (blue, green, red, and NIR) are used as input
channels.

Fig. 4. Scatterplot of (a) pixels vs. (b) objects.

Different land-use classes can be distinguished not only by their spectral behaviour but also by their
different textures. Texture operators transform input images in such a way that the texture is coded in
grey values. In our approach, we use a texture operator based on a co-occurrence matrix that measures
the contrast in a 5 × 5 pixel window. Fig. 5 shows the used texture operator in an example. The input
image is shown in Fig. 5a, the texture (calculated from the blue band) in Fig. 5b, and the average
object textures in Fig. 5c. Settlements are represented with dark pixels, greenlands with bright
pixels, and forests with middle grey pixels.

Fig. 5. (a) Input image, (b) texture blue band, and (c) average object texture.

The variance of the grey values of the pixels of an object is also a good indicator of the roughness
of a texture. Fig. 6 shows the calculated mean variance in the blue band for all objects. Settlement
objects have high variance, greenland objects have middle variance, and forest objects have low
variance. Fig. 7 shows the behaviour of the variance in the different bands: blue, green, red, and NIR.
The best discrimination between land-use classes using the variance can be seen in the blue band. In
the NIR band, all land-use classes have a similar distribution, which makes discrimination in this band
impossible.

Fig. 6. Mean variance of GIS objects in blue band.

Vegetation indices are very often used in pixel-based classification as an input channel to improve
the classification result. They are based on the spectral behaviour of chlorophyll, which absorbs red
light and reflects NIR light. In our approach, we employ the most widely used normalised difference
(Campbell, 1987):

$$\mathrm{VI} = \frac{\mathrm{IR} - \mathrm{R}}{\mathrm{IR} + \mathrm{R}} \qquad (1)$$

Fig. 8a shows the calculated vegetation index for pixels and Fig. 8b for objects. It can be seen that
settlements are represented typically by dark areas, whereas forests are represented mostly by bright
areas. The classification of greenlands is difficult because they can be represented by very bright
areas (e.g., fields with a high amount of vegetation) as well as by very dark areas (e.g., fields
shortly after the harvest).

All input channels defined so far are also used in “normal” pixel-based classification. In
object-based classification, it is possible to add further input channels, which do not describe
directly spectral or textural characteristics. For example, we can use the result of a pixel-based
classification and count the percentage of pixels that are classified to a specific land-use class.
This evaluation is shown in Fig. 9. The input image is shown in Fig. 9a and the pixel-based
classification result in Fig. 9b. Fig. 9c shows for each object the percentage of pixels that are
classified to the land-use class forest. White colour represents 100% and black colour represents 0%.
In Fig. 9b and c, it can be seen that forest is a land-use class that can be classified with high
accuracy in pixel-based as well as object-based classifications. Fig. 9d shows the percentage of
settlement pixels. Because of the high resolution (2 m) of the data, settlements cannot be detected as
homogeneous areas but they are split into different land-use classes depending on what the pixels are
actually representing. Therefore, settlement objects contain typically only 50–70% settlement pixels
in 2-m resolution images. This can also be seen in Fig. 9e, which shows the percentage of greenland
pixels. Whereas greenlands contain up to 100% greenland pixels, it can be seen that, in settlement
areas, pixels are also classified as greenlands.

An interesting visualisation of the feature space of the object-based classification can be made with
the combination of three object-based evaluations of the pixel-based classification. In Fig. 10, the
percentage of settlement pixels is assigned to the red band, the percentage of forest pixels to the
green band, and the percentage of greenland pixels to the blue band of an RGB image. The combination
of these three bands shows that the pixel-based classification of forests and greenlands is very
reliable, which can be seen in the bright green and blue colour of the corresponding objects.
Settlement areas in contrast cannot be classified as homogeneous areas. Therefore, settlement objects
are represented in a reddish colour that can be brownish or purple.

3. Classification results

The approach was tested on two test areas (16 and 9.1 km²), which were acquired at different dates
with a total of 951 objects (194 forests, 252 greenlands, 497 settlements, and 8 water objects). The
input channels were:

mean grey value blue band
mean grey value green band
mean grey value red band
mean grey value NIR band
mean grey value vegetation index
mean grey value texture from blue band
variance blue band
variance green band
variance red band
variance NIR band
variance vegetation index
variance texture
percentage forest pixel
percentage greenland pixel
percentage settlement pixel
percentage water pixel.
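
As an illustration of how such object-based channels might be computed, the following minimal sketch derives three kinds of channels from the list above: the vegetation index of Eq. (1), the mean and variance of a channel per object, and the percentage of pixels assigned to one class per object. It assumes that the GIS objects have been rasterised into a label image in which each pixel holds the id of its object. The function names, the toy arrays, and the class codes are hypothetical and not taken from the paper.

```python
import numpy as np

def ndvi(nir, red):
    """Normalised difference vegetation index per pixel, as in Eq. (1)."""
    nir = nir.astype(float)
    red = red.astype(float)
    denom = nir + red
    out = np.zeros_like(denom)
    # Pixels with nir + red == 0 stay at 0 to avoid a division by zero.
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

def object_mean_variance(channel, labels, object_id):
    """Mean and variance of one channel over all pixels of one object."""
    values = channel[labels == object_id]
    return float(values.mean()), float(values.var())

def class_percentage(pixel_classes, labels, object_id, class_id):
    """Percentage of an object's pixels assigned to class_id by a pixel-based classification."""
    values = pixel_classes[labels == object_id]
    return 100.0 * float(np.mean(values == class_id))

# Toy data (assumption): two objects of four pixels each.
labels = np.array([[1, 1, 2, 2],
                   [1, 1, 2, 2]])
red = np.array([[10, 10, 40, 60],
                [10, 10, 40, 60]])
nir = np.array([[30, 30, 40, 60],
                [30, 30, 40, 60]])
# Hypothetical class codes: 0 = forest, 1 = settlement.
pixel_classes = np.array([[0, 0, 1, 0],
                          [0, 0, 1, 1]])

vi = ndvi(nir, red)
vi_mean_1, vi_var_1 = object_mean_variance(vi, labels, 1)
red_mean_2, red_var_2 = object_mean_variance(red, labels, 2)
settlement_share_2 = class_percentage(pixel_classes, labels, 2, 1)
```

Stacking such values for all 16 channels would give one feature vector per object. The co-occurrence texture channel is not covered by this sketch.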

The input channels span a 16-dimensional feature
space. All objects of the test areas are used as training
objects for the classification. That means that those
objects are also training objects that are wrong in the
database. In a manual revision, we compared the GIS
data with the images. The number of objects that were
not collected correctly, or where it was not possible to
decide if they are collected correctly without further
information sources is 63, which is more than 6% of
all objects. The average percentage of changes in
topographic maps in western Europe per year is
6.4% in scale 1:50,000, 7.4% in scale 1:25,000 and
8% in scale 1:1,000,000 (Konecny, 1996). Therefore,
the approach is robust enough if we want to update the
GIS database in 1-year cycles.
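
The following is a minimal sketch of a Gaussian maximum likelihood classifier operating on object feature vectors. It assumes one multivariate normal distribution per land-use class and equal prior probabilities, which is a common textbook formulation and not necessarily the exact implementation used here. The function names, the small ridge term that keeps the covariance invertible, and the two-dimensional toy features are assumptions made for illustration.

```python
import numpy as np

def fit_gaussians(features, class_labels, ridge=1e-6):
    """Estimate mean vector and covariance matrix per class from training objects.

    Each class needs at least two training objects so that a covariance can be estimated.
    """
    models = {}
    for c in np.unique(class_labels):
        x = features[class_labels == c]
        mean = x.mean(axis=0)
        # The ridge term (assumption) keeps the covariance matrix invertible.
        cov = np.cov(x, rowvar=False) + ridge * np.eye(x.shape[1])
        models[c] = (mean, cov)
    return models

def log_likelihood(x, mean, cov):
    """Log density of a multivariate normal distribution at x."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    mahalanobis = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (logdet + mahalanobis + len(x) * np.log(2.0 * np.pi))

def classify(x, models):
    """Assign x to the class with the highest likelihood (equal priors assumed)."""
    return max(models, key=lambda c: log_likelihood(x, *models[c]))

# Toy training objects (assumption): two features per object instead of 16.
train_features = np.array([[80.0, 5.0], [85.0, 6.0], [78.0, 4.0], [82.0, 7.0],
                           [40.0, 30.0], [45.0, 35.0], [38.0, 28.0], [42.0, 33.0]])
train_classes = np.array(["forest"] * 4 + ["settlement"] * 4)

models = fit_gaussians(train_features, train_classes)
label_a = classify(np.array([81.0, 5.0]), models)
label_b = classify(np.array([41.0, 31.0]), models)
```

In the setting described above, the class labels of the training objects would come from the GIS database, so a small share of wrong labels enters the estimated means and covariances.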
Fig. 11a shows the GIS data and Fig. 11b shows
the result of the object-based classification on a part of
one test area. Altogether, 82 objects (which are 8.6%
of all objects) were classified into a different land-use
class than the one assigned to them in the GIS
database.
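
The comparison step itself needs no thresholds: an object is flagged when its most likely class differs from the class stored in the GIS database. A minimal sketch, assuming both are available as mappings from a hypothetical object id to a class name:

```python
def flag_changes(gis_classes, classified):
    """Return the ids of objects whose classified class differs from the GIS class."""
    return sorted(obj for obj, cls in classified.items()
                  if gis_classes.get(obj) != cls)

# Toy data (assumption): four objects, one of which is classified differently.
gis_classes = {1: "forest", 2: "greenland", 3: "settlement", 4: "water"}
classified = {1: "forest", 2: "settlement", 3: "settlement", 4: "water"}

flagged = flag_changes(gis_classes, classified)
flagged_share = 100.0 * len(flagged) / len(gis_classes)
```

The flagged objects are candidates only; as described in the following paragraph, they still have to be checked manually.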
These objects were subdivided manually into three
classes. The first class contains all objects where a
change in the landscape has happened and an update
in the GIS database has to be done. In this class, there
are 37 objects (45%). The second class contains all
objects where it is not clear if the GIS objects were
collected correctly. Higher-resolution data or some-
times even field inspections are needed to decide if the
GIS database has to be updated or not. In this class,
there are 26 objects (31%). The third class contains all
objects where the result of the classification is incor-
rect. In this class, there are 19 objects (23%).

4. Further work

The approach subdivides all objects into the classes water, forest, settlement, and greenland. This can
be refined if more object characteristics are evaluated. In the following, we suggest three possible
extensions of the approach.

Fig. 7. Object variance in different bands (x-axis, variance; y-axis, number of objects).

Fig. 8. Vegetation index for (a) single pixels and (b) objects.

4.1. Additional use of laser data

In Haala and Walter (1999), it was shown that the result of a pixel-based classification can be
improved significantly by the combined use of multispectral and laser data. Fig. 12 shows a pixel-based
classification result of a CIR (colour infrared) image without (b) and with (c) the use of laser data
as an additional channel. The laser data improve the classification result because they have a
complementary “behaviour” to the multispectral data. With laser data, the classes greenland and road
can be separated very well from the classes forest and settlement because of the different heights of
the pixels above the ground, whereas in multispectral data, the classes greenland and forest can be
separated very well from the classes roads and settlement because of the strongly different percentages
of chlorophyll. The four input channels, which were calculated from the result of the pixel-based
classification (percentage forest pixels, percentage greenland pixels, percentage settlement pixels,
and percentage water pixels), are the channels with the highest amount of influence for the
object-based classification. Therefore, the object-based classification should also be improved by the
combined use of multispectral and laser data.

With laser data, further input channels can be calculated like slope, average object height, average
object slope, etc. With high-density laser data, it could be possible to distinguish, for example,
between residential areas and industrial areas. Fig. 13 shows a laser profile (1 m raster width) of a
residential area (a) and an industrial area (b). In residential areas, there are typically houses with
sloped roofs and a lot of vegetation between the houses, whereas in industrial areas, there are
buildings with flat roofs and less vegetation. This characteristic can be described by a
two-dimensional evaluation of the slope directions of
each object and could be also useful to distinguish
between different types of vegetation.
The fusion of data from different sensors for
image segmentation is a relatively new field (Pohl
and van Genderen, 1998). The general aim is to
increase the information content in order to make the
segmentation easier. Instead of laser data, it could be
also possible to make a fusion with SAR data (e.g.,
see Dupas, 2000).
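
The laser-derived channels mentioned in this section (average object height and average object slope) could be computed along the following lines. This is a sketch under assumptions: the laser data are taken to be a regular height raster above ground, the slope is estimated with finite differences, and the function names and toy rasters are hypothetical.

```python
import numpy as np

def slope_degrees(height, cell_size):
    """Slope per raster cell in degrees, estimated with finite differences."""
    dz_dy, dz_dx = np.gradient(height.astype(float), cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

def object_height_and_slope(height, labels, object_id, cell_size=1.0):
    """Average height and average slope over all cells of one object."""
    inside = labels == object_id
    slope = slope_degrees(height, cell_size)
    return float(height[inside].mean()), float(slope[inside].mean())

# Toy rasters (assumption): a flat roof at 10 m and a roof rising 1 m per 1 m cell.
labels = np.ones((4, 4), dtype=int)
flat_roof = np.full((4, 4), 10.0)
sloped_roof = np.tile(np.arange(4.0), (4, 1))

flat_height, flat_slope = object_height_and_slope(flat_roof, labels, 1)
sloped_height, sloped_slope = object_height_and_slope(sloped_roof, labels, 1)
```

In real data, cells at object borders mix neighbouring surfaces, so a buffer that removes border cells would probably be needed before averaging.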

4.2. More texture measures

At the moment, we use a co-occurrence matrix,
mean variance, and mean contrast to describe the
texture of objects. These texture measures can be also
used in pixel-based classification by measuring the
variance and contrast of each pixel in an n × n
window. The problem of a window with a fixed size
is that mixed pixels at the object borders are very often
classified to a wrong land-use class. The larger the
window, the more pixels will be classified wrongly.
This problem does not appear in object-based
classification because we do not evaluate a window with a
fixed size but use the existing object geometry (in
order not to use mixed pixels at the object border, a
buffer is used and border pixels are removed).
Therefore, we suggest using more texture measures. Fig. 14
shows an example of a possible evaluation of the
texture. The images are processed with a Sobel
operator. Typically, farmland objects contain many
edges with one main edge direction (a), whereas in
forest objects, the direction of the edges is equally
distributed (b) and in settlement objects, several main
directions can be found (c). Other texture measures
could be, for example, the average length or contrast
of the edges. However, several tests have to be
performed in order to prove these ideas.
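
One way to explore the edge-direction idea is a histogram of gradient orientations inside each object. The sketch below applies the two 3 × 3 Sobel kernels with plain NumPy slicing and counts orientations between 0 and 180 degrees for pixels with a sufficiently strong gradient. The number of bins, the magnitude threshold, and the function names are assumptions. Note that the gradient orientation is perpendicular to the edge itself.

```python
import numpy as np

def sobel_gradients(image):
    """Horizontal and vertical Sobel responses for the interior pixels (no padding)."""
    img = image.astype(float)
    gx = (img[:-2, 2:] + 2 * img[1:-1, 2:] + img[2:, 2:]
          - img[:-2, :-2] - 2 * img[1:-1, :-2] - img[2:, :-2])
    gy = (img[2:, :-2] + 2 * img[2:, 1:-1] + img[2:, 2:]
          - img[:-2, :-2] - 2 * img[:-2, 1:-1] - img[:-2, 2:])
    return gx, gy

def direction_histogram(image, mask, bins=9, min_magnitude=1.0):
    """Histogram of gradient orientations (0 to 180 degrees) inside an object mask."""
    gx, gy = sobel_gradients(image)
    inner = mask[1:-1, 1:-1]  # the Sobel result covers interior pixels only
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    keep = inner & (magnitude >= min_magnitude)
    hist, _ = np.histogram(angle[keep], bins=bins, range=(0.0, 180.0))
    return hist

# Toy images (assumption): grey values rising along the columns or along the rows.
mask = np.ones((5, 5), dtype=bool)
ramp_x = np.tile(np.arange(5.0), (5, 1))
ramp_y = ramp_x.T

hist_x = direction_histogram(ramp_x, mask)
hist_y = direction_histogram(ramp_y, mask)
```

A histogram with one dominant bin would correspond to the single main direction described for farmland, several peaks to settlement objects, and a flat histogram to forest objects.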

4.3. Use of multitemporal data

The main reason that the approach classifies objects into a wrong class is that in practice, the
appearance of objects can be very inhomogeneous. If, for example, a settlement object contains large
areas of greenland but only few pixels that represent a house or a road, it will be classified as
greenland and not as settlement. The object will be marked as an updated object and an operator has to
check the object each time the data are revised because the approach will classify the object every
time as greenland.

A solution for this problem is to store all parameters of the n-dimensional feature space (mean grey
values, mean variance, etc.) of an object when it is checked for the first time. If, then, later the
object is marked again as an update, the program can measure the distance of the object in the current
and the earlier stored feature space. If the distance is under a specific threshold, it can be assumed
that the object is still the same and therefore does not have to be updated.

Fig. 9. Percentage of correctly classified pixels. (a) Input image, (b) pixel-based classification
result, (c) percentage of correctly classified forest pixels, (d) percentage of correctly classified
settlement pixels, (e) percentage of correctly classified greenland pixels.

Fig. 10. Visualisation of the feature space of the object-based classification.

5. Conclusion

The basic idea of the approach is that image interpretation is not based only on the interpretation of
single pixels but on whole object structures. Therefore, we do not classify only single pixels but
groups of pixels that represent already existing objects in a GIS database. Each object is described
by an n-dimensional feature vector and classified to the most likely class based on a supervised
maximum likelihood classification. The object-based classification needs no tuning parameters like
user-defined thresholds. It works fully automatically because all information for the classification
is derived from automatically generated training areas. The result is not only a change detection but
also a classification into the most likely land-use class.

The results show that approximately 8.6% of all objects (82 objects from 951) are marked as changes.
From these 82 objects, 45% are real changes, 31% are potential changes, and 23% are wrongly classified.
That means that the amount of interactive checking of the data can be decreased significantly. On the
other hand, we have to ask if the object-based classification finds all changes. A change in the
landscape can only be detected if it affects a large part of an object because the object-based
classification uses the existing object geometry. If, for example, a forest object has a size of
5000 m² and in that forest object a small settlement area with 200 m² is built up, then this approach
will fail.

Further techniques have to be developed in order to cover this problem. Because forest areas can be
classified very accurately in pixel-based classification, it could be additionally tested whether
there are large areas in a forest object that are classified to another land-use class. The same
approach could be used for water areas because water is also a land-use class that can be classified
very accurately in pixel-based classification. More difficult is the situation for the land-use
classes greenland and settlement, which have typically an inhomogeneous appearance in a pixel-based
classification. Here, we suggest using a multiscale approach to make additional verification of the
objects (e.g., see Heipke and Straub, 1999).

Fig. 11. (a) GIS data and (b) result of the classification.

Fig. 12. (a) Input image, (b) classification with multispectral data, and (c) classification with
multispectral and laser data.

Fig. 13. Laser profiles of (a) a residential and (b) an industrial area.

Fig. 14. Different gradient directions for (a) greenland, (b) forest, (c) settlement.


Up to now, we can only distinguish between the land-use classes forest, settlement, greenland, and
water. This can be refined if more object characteristics are evaluated. Some possible object
characteristics are defined in this paper and have to be tested in future work.

References

Aplin, P., Atkinson, P., Curran, P., 1999. Per-field classification of landuse using the forthcoming
very fine resolution satellite sensors: problems and potential solutions. In: Atkinson, P., Tate, N.
(Eds.), Advances in Remote Sensing and GIS Analysis. Wiley, Chichester, pp. 219–239.
Arbeitsgemeinschaft der Vermessungsverwaltungen der Länder der Bundesrepublik Deutschland (AdV), 1988.
Amtlich Topographisches-Kartographisches Informationssystem (ATKIS). Landesvermessungsamt
Nordrhein-Westfalen, Bonn.
Blaschke, T., Lang, S., Lorup, E., Strobl, J., Zeil, P., 2000. Object-oriented image processing in an
integrated GIS/remote sensing environment and perspectives for environmental applications. In:
Cremers, A., Greve, K. (Eds.), Environmental Information for Planning, Politics and the Public,
vol. II. Metropolis-Verlag, Marburg, pp. 555–570.
Campbell, J.B., 1987. Introduction into Remote Sensing. The Guildford Press, New York.
Dupas, C.A., 2000. SAR and LANDSAT TM image fusion for land cover classification in the Brazilian
Atlantic Forest Domain. International Archives for Photogrammetry and Remote Sensing XXXIII
(Part B1), 96–103.
Haala, N., Walter, V., 1999. Classification of urban environments using LIDAR and color aerial
imagery. International Archives for Photogrammetry and Remote Sensing XXXII (Part 7-4-3W6), 76–82.
Hahn, M., Stallmann, D., Staetter, C., 1996. The DPA-sensor system for topographic and thematic
mapping. International Archives of Photogrammetry and Remote Sensing XXXI (Part B2), 141–146.
Heipke, C., Straub, B.-M., 1999. Relations between multi scale imagery and GIS aggregation levels for
the automatic extraction of vegetation areas. Proceedings of the ISPRS Joint Workshop on “Sensors and
Mapping from Space”, Hannover. On CD-ROM.
Konecny, G., 1996. Hochauflösende Fernerkundungssensoren für kartographische Anwendungen in
Entwicklungsländer. ZPF 64 (2), 39–51.
Pohl, C., van Genderen, J., 1998. Multisensor image fusion in remote sensing: concepts, methods and
applications. International Journal on Remote Sensing 19 (5), 823–864.
Walter, V., 1998. Automatic classification of remote sensing data for GIS database revision.
International Archives for Photogrammetry and Remote Sensing XXXII (Part 4), 641–648.
Walter, V., 1999. Comparison of the potential of different sensors for an automatic approach for
change detection in GIS databases. Lecture Notes in Computer Science, Integrated Spatial Databases:
Digital Images and GIS, International Workshop ISD ’99. Springer, Heidelberg, pp. 47–63.
Walter, V., 2000. Automatic change detection in GIS databases based on classification of multispectral
data. International Archives of Photogrammetry and Remote Sensing XXXIII (Part B4), 1138–1145.
Walter, V., Fritsch, D., 2000. Automatic verification of GIS data using high resolution multispectral
data. International Archives of Photogrammetry and Remote Sensing XXXII (Part 3/1), 485–489.
