ISPRS Journal of Photogrammetry & Remote Sensing 58 (2004) 225–238
www.elsevier.com/locate/isprsjprs

Object-based classification of remote sensing data for change detection

Volker Walter *

Institute for Photogrammetry, University of Stuttgart, Geschwister-Scholl-Str. 24 D, Stuttgart D-70174, Germany

Received 31 January 2003; accepted 26 September 2003

* Tel.: +49-711-121-4091; fax: +49-711-121-3297. E-mail address: Volker.Walter@ifp.uni-stuttgart.de (V. Walter).
doi:10.1016/j.isprsjprs.2003.09.007

Abstract

In this paper, a change detection approach based on an object-based classification of remote sensing data is introduced. The approach classifies not single pixels but groups of pixels that represent already existing objects in a GIS database. The approach is based on a supervised maximum likelihood classification. The multispectral bands grouped by objects and very different measures that can be derived from multispectral bands represent the n-dimensional feature space for the classification. The training areas are derived automatically from the geographical information system (GIS) database.

After an introduction to the general approach, different input channels for the classification are defined and discussed. The results of a test on two test areas are presented. Afterwards, further measures, which can improve the result of the classification and enable the distinction between more land-use classes than with the introduced approach, are presented.

© 2003 Elsevier B.V. All rights reserved.

Keywords: change detection; classification; object-oriented image analysis; data fusion

1. Introduction

In Walter and Fritsch (2000), a concept for the automatic revision of geographical information system (GIS) databases using multispectral remote sensing data was introduced. This approach can be subdivided into two steps (see Fig. 1). In a first step, remote sensing data are classified with a supervised maximum likelihood classification into different land-use classes. The training areas are derived from an already existing GIS database in order to avoid the time-consuming task of manual acquisition. This can be done if it is assumed that the number of changes in the real world is very small compared with the number of all GIS objects in the database. This assumption is justified because we want to realise update cycles in the range of several months.

In a second step, the classified remote sensing data have to be matched with the existing GIS objects in order to find those objects where a change occurred, or which were collected wrongly. We solved this task by measuring per object the percentage, homogeneity, and form of the pixels that are classified to the same object class as the respective object stored in the database (Walter, 2000). All objects are classified into the classes fully verified, partly verified, and not found by using thresholds that can be defined interactively by the user.

Fig. 1. Pixel-based classification approach.

The problem of using thresholds is that they are data-dependent. For example, the percentage of vegetation pixels varies significantly between data that are captured in summer or in winter. Other influencing factors are light and weather conditions, soil type, or daytime. Therefore, we cannot use the same thresholds for different datasets. In order to avoid the problem of defining data-dependent thresholds, we introduce an object-based supervised classification approach. The object-based classification works in the same way as a pixel-based classification (see Fig. 2), with the difference that we do not classify each pixel but combine all pixels of each object and classify them together. Again, the training areas for the classification of the objects are derived from the existing database in order to avoid a time-consuming manual acquisition.

In a "normal" classification, the greyscale values of each pixel in different multispectral channels and possibly some other preprocessed texture channels are used as input. For the classification of groups of pixels, we have to define new measures that can be very simple (e.g., the mean grey value of all pixels of an object in a specific channel) but also very complex, like measures that describe the form of an object. This approach is very flexible because it can combine very different measures for describing an object. We can even use the result of a pixel-based classification and count for each object the percentage of pixels that are classified to a specific land-use class.

Because the result of the approach is a classification into the most likely class, the problematic part of matching is now replaced by a single comparison of the classification result with the GIS database without using any thresholds.

1.1. Related work

This kind of approach is an object-oriented image analysis that is also successfully applied to other
problems. A good overview of different approaches can be found in Blaschke et al. (2000). These approaches can be subdivided into approaches that use existing GIS data to superimpose it on an image (per-field or per-parcel classification), and approaches that use object-oriented classification rules without any GIS input. Approaches that use existing GIS data are not very widely used today. In Aplin et al. (1999), an example for a per-field classification approach is introduced, which first classifies the image into different land-use classes. Afterwards, the fields (which represent forest parcels from a GIS database) are subdivided into different classes, depending on the classification result, by using thresholds. The main difference of existing approaches compared with our approach is that no thresholds are used in our approach.

Fig. 2. Differences between object-based and pixel-based classification.

2. Object-based classification

2.1. Input data

The following tests were carried out with ATKIS datasets. ATKIS is the German national topographic and cartographic database, and captures the landscape in the scale of 1:25,000 (AdV, 1988). In Walter (1999), it was shown that a spatial resolution of at least 2 m is needed to update data in the scale of 1:25,000. The remote sensing data were captured with
the DPA system, which is an optical airborne digital camera (Hahn et al., 1996). The original resolution of 0.5 m was resampled to a resolution of 2 m. The DPA system has four multispectral channels [blue 440–525 nm, green 520–600 nm, red 610–685 nm, near-infrared (NIR) 770–890 nm].

2.2. Classification classes

Currently, 63 different object classes are collected in ATKIS. There are a lot of object classes that can have very similar appearances in an image of 2 m pixel size (e.g., industrial areas, residential areas, or areas of mixed use). Therefore, we do not use 63 land-use classes for the classification but subdivide all object classes into the five land-use classes: water, forest, settlement, greenland, and roads. The land-use class roads is only used in the first step in the process for the pixel-based classification. Because of the linear shape, roads consist of many mixed pixels in a resolution of 2 m and have to be checked with other techniques (see Walter, 1998).

Fig. 3. Input data for (a) object-based and (b) pixel-based classification.

2.3. Input channels

Like in a pixel-based classification, we can use all spectral bands as input channels. The difference is that in the pixel-based classification, each pixel is classified separately, whereas in the object-based classification, all pixels that belong to one GIS object are grouped together. In order to analyse the spectral behaviour of objects, we calculate the mean grey value of each channel for all GIS objects. Fig. 3 shows as an example the original input data (b) and
the mean RGB (red green blue) value (a) of each GIS object. The result of the pixel grouping is like a smoothing of the data. The spectral behaviour of the objects is similar to the typical spectral behaviour of the pixels. For example, forest areas are represented in the green channel by dark pixels/objects, whereas settlements are represented by bright pixels/objects.

This behaviour can also be seen in Fig. 4. The scatterplots show the distribution of (a) the grey values of settlement and forest pixels compared with the distribution of (b) the mean grey value of settlement and forest objects in the channels red and NIR. It can be seen that the behaviour is similar but the separation of the two classes becomes blurred because of the smoothing effect. In the object-based classification, all multispectral bands of the DPA camera system (blue, green, red, and NIR) are used as input channels.

Fig. 4. Scatterplot of (a) pixels vs. (b) objects.

Different land-use classes can be distinguished not only by their spectral behaviour but also by their different textures. Texture operators transform input images in such a way that the texture is coded in grey values. In our approach, we use a texture operator based on a co-occurrence matrix that measures the contrast in a 5 × 5 pixel window. Fig. 5 shows the texture operator in an example. The input image is shown in Fig. 5a, the texture (calculated from the blue band) in Fig. 5b, and the average object textures in Fig. 5c. Settlements are represented with dark pixels, greenlands with bright pixels, and forests with middle grey pixels.

The variance of the grey values of the pixels of an object is also a good indicator of the roughness of a texture. Fig. 6 shows the calculated mean variance in the blue band for all objects. Settlement objects have
high variance, greenland objects have middle variance, and forest objects have low variance. Fig. 7 shows the behaviour of the variance in the different bands: blue, green, red, and NIR. The best discrimination between land-use classes using the variance can be seen in the blue band. In the NIR band, all land-use classes have a similar distribution, which makes discrimination in this band impossible.

Fig. 5. (a) Input image, (b) texture blue band, and (c) average object texture.

Vegetation indices are very often used in pixel-based classification as an input channel to improve the classification result. They are based on the spectral behaviour of chlorophyll, which absorbs red light and reflects NIR light. In our approach, we employ the most widely used normalised difference (Campbell, 1987), where IR and R denote the grey values in the NIR and red channels:

\[ VI = \frac{IR - R}{IR + R} \tag{1} \]

Fig. 8a shows the calculated vegetation index for pixels and Fig. 8b for objects. It can be seen that
settlements are represented typically by dark areas, whereas forests are represented mostly by bright areas. The classification of greenlands is difficult because they can be represented by very bright areas (e.g., fields with a high amount of vegetation) as well as by very dark areas (e.g., fields shortly after the harvest).

Fig. 6. Mean variance of GIS objects in blue band.

All input channels defined so far are also used in "normal" pixel-based classification. In object-based classification, it is possible to add further input channels, which do not directly describe spectral or textural characteristics. For example, we can use the result of a pixel-based classification and count the percentage of pixels that are classified to a specific land-use class. This evaluation is shown in Fig. 9. The input image is shown in Fig. 9a and the pixel-based classification result in Fig. 9b. Fig. 9c shows for each object the percentage of pixels that are classified to the land-use class forest. White colour represents 100% and black colour represents 0%. In Fig. 9b and c, it can be seen that forest is a land-use class that can be classified with high accuracy in pixel-based as well as object-based classifications. Fig. 9d shows the percentage of settlement pixels. Because of the high resolution (2 m) of the data, settlements cannot be detected as homogeneous areas but are split into different land-use classes depending on what the pixels are actually representing. Therefore, settlement objects typically contain only 50–70% settlement pixels in 2-m resolution images. This can also be seen in Fig. 9e, which shows the percentage of greenland pixels. Whereas greenlands contain up to 100% greenland pixels, it can be seen that, in settlement areas, pixels are also classified as greenlands.

An interesting visualisation of the feature space of the object-based classification can be made with the combination of three object-based evaluations of the pixel-based classification. In Fig. 10, the percentage of settlement pixels is assigned to the red band, the percentage of forest pixels to the green band, and the percentage of greenland pixels to the blue band of an RGB image. The combination of these three bands shows that the pixel-based classification of forests and greenlands is very reliable, which can be seen in the bright green and blue colour of the corresponding objects. Settlement areas, in contrast, cannot be classified as homogeneous areas. Therefore, settlement objects are represented in a reddish colour that can be brownish or purple.

3. Classification results

The approach was tested on two test areas (16 and 9.1 km²), which were acquired at different dates, with a total of 951 objects (194 forests, 252 greenlands, 497 settlements, and 8 water objects). The input channels were:

mean grey value blue band
mean grey value green band
mean grey value red band
mean grey value NIR band
mean grey value vegetation index
mean grey value texture from blue band
variance blue band
variance green band
variance red band
variance NIR band
variance vegetation index
variance texture
percentage forest pixel
percentage greenland pixel
percentage settlement pixel
percentage water pixel

The input channels span a 16-dimensional feature space. All objects of the test areas are used as training objects for the classification. That means that objects that are wrong in the database are also used as training objects. In a manual revision, we compared the GIS data with the images. The number of objects that were not collected correctly, or where it was not possible to decide if they are collected correctly without further information sources, is 63, which is more than 6% of all objects. The average percentage of changes in topographic maps in western Europe per year is 6.4% in scale 1:50,000, 7.4% in scale 1:25,000 and 8% in scale 1:1,000,000 (Konecny, 1996). Therefore, the approach is robust enough if we want to update the GIS database in 1-year cycles.

Fig. 11a shows the GIS data and Fig. 11b shows the result of the object-based classification on a part of one test area. Altogether, 82 objects (which are 8.6% of all objects) were classified into a different land-use class than the one assigned to them in the GIS database. These objects were subdivided manually into three classes. The first class contains all objects where a change in the landscape has happened and an update in the GIS database has to be done. In this class, there are 37 objects (45%). The second class contains all objects where it is not clear if the GIS objects were collected correctly. Higher-resolution data or sometimes even field inspections are needed to decide if the GIS database has to be updated or not. In this class, there are 26 objects (31%). The third class contains all objects where the result of the classification is incorrect. In this class, there are 19 objects (23%).

Fig. 7. Object variance in different bands (x-axis, variance; y-axis, number of objects).

4. Further work

The approach subdivides all objects into the classes water, forest, settlement, and greenland. This can be refined if more object characteristics are evaluated. In
the following, we suggest three possible extensions of the approach.

Fig. 8. Vegetation index for (a) single pixels and (b) objects.

4.1. Additional use of laser data

In Haala and Walter (1999), it was shown that the result of a pixel-based classification can be improved significantly by the combined use of multispectral and laser data. Fig. 12 shows a pixel-based classification result of a CIR (colored infrared) image without (b) and with (c) the use of laser data as an additional channel. The laser data improve the classification result because they have a complementary "behaviour" to the multispectral data. With laser data, the classes greenland and road can be separated very well from the classes forest and settlement because of the different heights of the pixels above the ground, whereas in multispectral data, the classes greenland and forest can be separated very well from the classes roads and settlement because of the strongly different percentages of chlorophyll. The four input channels that were calculated from the result of the pixel-based classification (percentage forest pixels, percentage greenland pixels, percentage settlement pixels, and percentage water pixels) are the channels with the highest amount of influence on the object-based classification. Therefore, the object-based classification should also be improved by the combined use of multispectral and laser data.

With laser data, further input channels can be calculated, like slope, average object height, average object slope, etc. With high-density laser data, it could be possible to distinguish, for example, between residential areas and industrial areas. Fig. 13 shows a laser profile (1 m raster width) of a residential area (a) and an industrial area (b). In residential areas, there are typically houses with sloped roofs and a lot of vegetation between the houses, whereas in industrial areas, there are buildings with flat roofs and less vegetation. This characteristic can be described by a
two-dimensional evaluation of the slope directions of each object and could also be useful to distinguish between different types of vegetation.

The fusion of data from different sensors for image segmentation is a relatively new field (Pohl and van Genderen, 1998). The general aim is to increase the information content in order to make the segmentation easier. Instead of laser data, it could also be possible to make a fusion with SAR data (e.g., see Dupas, 2000).

4.2. More texture measures

At the moment, we use a co-occurrence matrix, mean variance, and mean contrast to describe the texture of objects. These texture measures can also be used in pixel-based classification by measuring the variance and contrast of each pixel in an n × n window. The problem of a window with a fixed size is that mixed pixels at the object borders are very often classified to a wrong land-use class. The larger the window, the more pixels will be classified wrongly. This problem does not appear in object-based classification because we do not evaluate a window with a fixed size but use the existing object geometry (in order not to use mixed pixels at the object border, a buffer is used and border pixels are removed). Therefore, we suggest using more texture measures.

Fig. 14 shows an example of a possible evaluation of the texture. The images are processed with a Sobel operator. Typically, farmland objects contain many edges with one main edge direction (a), whereas in forest objects, the direction of the edges is equally distributed (b) and in settlement objects, several main directions can be found (c). Other texture measures could be, for example, the average length or contrast of the edges. However, several tests have to be performed in order to prove these ideas.

Fig. 9. Percentage of correctly classified pixels. (a) Input image, (b) pixel-based classification result, (c) percentage of correctly classified forest pixels, (d) percentage of correctly classified settlement pixels, (e) percentage of correctly classified greenland pixels.

4.3. Use of multitemporal data

The main reason that the approach classifies objects into a wrong class is that in practice, the
appearance of objects can be very inhomogeneous. If, for example, a settlement object contains large areas of greenland but only few pixels that represent a house or a road, it will be classified as greenland and not as settlement. The object will be marked as an updated object and an operator has to check the object each time the data are revised because the approach will classify the object every time as greenland.

Fig. 10. Visualisation of the feature space of the object-based classification.

A solution for this problem is to store all parameters of the n-dimensional feature space (mean grey values, mean variance, etc.) of an object when it is checked for the first time. If the object is later marked again as an update, the program can measure the distance between the current and the earlier stored feature vector of the object. If the distance is under a specific threshold, it can be assumed that the object is still the same and therefore does not have to be updated.

Fig. 11. (a) GIS data and (b) result of the classification.

Fig. 12. (a) Input image, (b) classification with multispectral data, and (c) classification with multispectral and laser data.

Fig. 13. Laser profiles of (a) a residential and (b) an industrial area.

Fig. 14. Different gradient directions for (a) greenland, (b) forest, (c) settlement.

5. Conclusion

The basic idea of the approach is that image interpretation is not based only on the interpretation of single pixels but on whole object structures. Therefore, we do not classify only single pixels but groups of pixels that represent already existing objects in a GIS database. Each object is described by an n-dimensional feature vector and classified to the most likely class based on a supervised maximum likelihood classification. The object-based classification needs no tuning parameters like user-defined thresholds. It works fully automatically because all information for the classification is derived from automatically generated training areas. The result is not only a change detection but also a classification into the most likely land-use class.

The results show that approximately 8.6% of all objects (82 objects from 951) are marked as changes. From these 82 objects, 45% are real changes, 31% are potential changes, and 23% are wrongly classified. That means that the amount of interactive checking of the data can be decreased significantly. On the other hand, we have to ask if the object-based classification finds all changes. A change in the landscape can only be detected if it affects a large part of an object because the object-based classification uses the existing object geometry. If, for example, a forest object has a size of 5000 m² and in that forest object a small settlement area with 200 m² is built up, then this approach will fail.

Further techniques have to be developed in order to cover this problem. Because forest areas can be classified very accurately in pixel-based classification, it could be additionally tested whether there are large areas in a forest object that are classified to another land-use class. The same approach could be used for water areas because water is also a land-use class that can be classified very accurately in pixel-based classification. More difficult is the situation for the land-use classes greenland and settlement, which typically have an inhomogeneous appearance in a pixel-based classification. Here, we suggest using a multiscale approach to make additional verification of the objects (e.g., see Heipke and Straub, 1999).
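The additional test suggested above for forest objects could be sketched as follows. This is only a minimal illustration under stated assumptions, not the procedure used in the paper: it assumes that the pixel-based classification is available as a grid of class labels together with a boolean mask of the GIS object, and the function name `largest_foreign_region`, the toy grid, and the minimum-area value are hypothetical.

```python
from collections import deque


def largest_foreign_region(labels, mask, expected):
    """Return the pixel count of the largest 4-connected region inside the
    object mask whose pixel-based class differs from `expected`."""
    rows, cols = len(labels), len(labels[0])
    seen = [[False] * cols for _ in range(rows)]
    best = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or not mask[r][c] or labels[r][c] == expected:
                continue
            # Breadth-first search over one connected "foreign" region.
            size = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and mask[ny][nx]
                            and labels[ny][nx] != expected):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            best = max(best, size)
    return best


# Toy forest object (F) with a compact settlement patch (S) and one stray pixel.
grid = [
    "FFFFFF",
    "FSSFFF",
    "FSSFFS",
    "FFFFFF",
]
labels = [list(row) for row in grid]
mask = [[True] * 6 for _ in range(4)]  # every pixel belongs to the object
PIXEL_AREA_M2 = 2 * 2                  # 2 m resolution, as in the test data
MIN_AREA_M2 = 12                       # hypothetical minimum size of interest

region_px = largest_foreign_region(labels, mask, "F")
needs_check = region_px * PIXEL_AREA_M2 >= MIN_AREA_M2
```

The sketch evaluates connected regions rather than the overall percentage of foreign pixels, so that a compact built-up patch is reported while isolated misclassified pixels are ignored. Note that the minimum area is again a user-chosen value, so such a test would reintroduce a threshold.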
Up to now, we can only distinguish between the land-use classes forest, settlement, greenland, and water. This can be refined if more object characteristics are evaluated. Some possible object characteristics are defined in this paper and have to be tested in future work.

References

Aplin, P., Atkinson, P., Curran, P., 1999. Per-field classification of landuse using the forthcoming very fine resolution satellite sensors: problems and potential solutions. In: Atkinson, P., Tate, N. (Eds.), Advances in Remote Sensing and GIS Analysis. Wiley, Chichester, pp. 219–239.

Arbeitsgemeinschaft der Vermessungsverwaltungen der Länder der Bundesrepublik Deutschland (AdV), 1988. Amtlich Topographisches-Kartographisches Informationssystem (ATKIS). Landesvermessungsamt Nordrhein-Westfalen, Bonn.

Blaschke, T., Lang, S., Lorup, E., Strobl, J., Zeil, P., 2000. Object-oriented image processing in an integrated GIS/remote sensing environment and perspectives for environmental applications. In: Cremers, A., Greve, K. (Eds.), Environmental Information for Planning, Politics and the Public, vol. II. Metropolis-Verlag, Marburg, pp. 555–570.

Campbell, J.B., 1987. Introduction to Remote Sensing. The Guilford Press, New York.

Dupas, C.A., 2000. SAR and LANDSAT TM image fusion for land cover classification in the Brazilian Atlantic Forest Domain. International Archives of Photogrammetry and Remote Sensing XXXIII (Part B1), 96–103.

Haala, N., Walter, V., 1999. Classification of urban environments using LIDAR and color aerial imagery. International Archives of Photogrammetry and Remote Sensing XXXII (Part 7-4-3W6), 76–82.

Hahn, M., Stallmann, D., Staetter, C., 1996. The DPA-sensor system for topographic and thematic mapping. International Archives of Photogrammetry and Remote Sensing XXXI (Part B2), 141–146.

Heipke, C., Straub, B.-M., 1999. Relations between multi scale imagery and GIS aggregation levels for the automatic extraction of vegetation areas. Proceedings of the ISPRS Joint Workshop on "Sensors and Mapping from Space", Hannover. On CD-ROM.

Konecny, G., 1996. Hochauflösende Fernerkundungssensoren für kartographische Anwendungen in Entwicklungsländern. ZPF 64 (2), 39–51.

Pohl, C., van Genderen, J., 1998. Multisensor image fusion in remote sensing: concepts, methods and applications. International Journal of Remote Sensing 19 (5), 823–864.

Walter, V., 1998. Automatic classification of remote sensing data for GIS database revision. International Archives of Photogrammetry and Remote Sensing XXXII (Part 4), 641–648.

Walter, V., 1999. Comparison of the potential of different sensors for an automatic approach for change detection in GIS databases. Lecture Notes in Computer Science, Integrated Spatial Databases: Digital Images and GIS, International Workshop ISD '99. Springer, Heidelberg, pp. 47–63.

Walter, V., 2000. Automatic change detection in GIS databases based on classification of multispectral data. International Archives of Photogrammetry and Remote Sensing XXXIII (Part B4), 1138–1145.

Walter, V., Fritsch, D., 2000. Automatic verification of GIS data using high resolution multispectral data. International Archives of Photogrammetry and Remote Sensing XXXII (Part 3/1), 485–489.
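As a compact illustration of the object-based classification described in Sections 2 and 3, the following sketch trains one Gaussian model per land-use class from the classes stored in the GIS database, classifies every object to its most likely class, and flags objects whose result differs from the database. It is a simplified sketch, not the implementation used in the paper: it assumes a diagonal covariance (independent features) instead of a full covariance matrix, uses only two hypothetical features per object instead of the 16 input channels, and all names and values are invented for illustration.

```python
import math
from collections import defaultdict


def train(objects):
    """objects: list of (feature_vector, gis_class) pairs.
    Returns, per class, the feature means and variances estimated from
    all objects that carry this class in the GIS database."""
    grouped = defaultdict(list)
    for features, gis_class in objects:
        grouped[gis_class].append(features)
    model = {}
    for cls, rows in grouped.items():
        n, dims = len(rows), len(rows[0])
        means = [sum(r[d] for r in rows) / n for d in range(dims)]
        # A small floor keeps the variance positive for tiny training sets.
        variances = [max(sum((r[d] - means[d]) ** 2 for r in rows) / n, 1e-6)
                     for d in range(dims)]
        model[cls] = (means, variances)
    return model


def log_likelihood(features, means, variances):
    """Log-density of a Gaussian with diagonal covariance (simplification)."""
    return sum(-0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
               for x, m, v in zip(features, means, variances))


def classify(model, features):
    """Return the class with the highest likelihood for this object."""
    return max(model, key=lambda cls: log_likelihood(features, *model[cls]))


# Hypothetical per-object features: (mean NIR grey value, variance blue band),
# both scaled to [0, 1]. The last object is stored as forest in the database
# but looks like a settlement.
objects = [
    ((0.80, 0.10), "forest"),
    ((0.75, 0.12), "forest"),
    ((0.78, 0.09), "forest"),
    ((0.30, 0.60), "settlement"),
    ((0.35, 0.55), "settlement"),
    ((0.32, 0.65), "settlement"),
    ((0.31, 0.58), "forest"),
]

# As in the paper, all objects serve as training objects, including wrong ones.
model = train(objects)
changed = [i for i, (features, gis_class) in enumerate(objects)
           if classify(model, features) != gis_class]
```

In this toy example only the last object is reported as a candidate change, although it also contributed to the training of the forest class. This mirrors the assumption made in the paper that wrongly stored objects are a small minority of the training objects.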