Digital Image Classification


Digital Image Classification
By aleemuddin
Jamia Millia Islamia University, New Delhi, India

Published in: Education, Technology


  1. IIRS: DIGITAL IMAGE CLASSIFICATION. Poonam S. Tiwari, Photogrammetry and Remote Sensing Division
  2. Main lecture topics
     - What is it and why use it?
     - Image space versus feature space
     - Distances in feature space
     - Decision boundaries in feature space
     - Unsupervised versus supervised training
     - Classification algorithms
     - Validation (how good is the result?)
     - Problems
  3. What is Digital Image Classification?
     - Multispectral classification is the process of sorting pixels into a finite number of individual classes, or categories of data, based on their data file values. If a pixel satisfies a certain set of criteria, it is assigned to the class that corresponds to those criteria.
     - Multispectral classification may be performed using a variety of algorithms:
       - hard classification using supervised or unsupervised approaches,
       - classification using fuzzy logic, and/or
       - hybrid approaches, often involving the use of ancillary information.
  4. What is Digital Image Classification?
     - grouping of similar pixels
     - separation of dissimilar ones
     - assigning class labels to pixels
     - resulting in a manageable number of classes
  5. CLASSIFICATION METHODS
     - MANUAL: visual interpretation; a combination of spectral and spatial information
     - COMPUTER ASSISTED: mainly spectral information
     - STRATIFIED: using GIS functionality to incorporate knowledge from other sources of information
  6. Why use it?
     - To translate the continuous variability of image data into map patterns that provide meaning to the user.
     - To obtain insight into the data with respect to ground cover and surface characteristics.
     - To find anomalous patterns in the image data set.
  7. Why use it? (advantages)
     - Cost-efficient in the analysis of large data sets
     - Results can be reproduced
     - More objective than visual interpretation
     - Effective analysis of complex multi-band (spectral) interrelationships
  8. Dimensionality of Data
     - Spectral dimensionality is determined by the number of sets of values being used in a process.
     - In image processing, each band of data is a set of values. An image with four bands of data is said to be four-dimensional (Jensen, 1996).
  9. Measurement Vector
     - The measurement vector of a pixel is the set of data file values for one pixel in all n bands.
     - Although image data files are stored band-by-band, it is often necessary to extract the measurement vectors for individual pixels.
  10. Mean Vector
     - When the measurement vectors of several pixels are analyzed, a mean vector is often calculated.
     - This is the vector of the means of the data file values in each band; it has n elements, one mean per band.
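The mean vector described above can be computed directly with numpy. The pixel values below are hypothetical, not from the lecture; the point is only that averaging over pixels, band by band, yields one mean per band.

```python
import numpy as np

# Hypothetical measurement vectors for three pixels in a four-band image
# (rows = pixels, columns = bands); the values are illustrative only.
pixels = np.array([
    [34, 25, 10, 40],
    [34, 24, 12, 42],
    [11, 77, 30, 20],
], dtype=float)

# Mean vector: the mean of the data file values in each band (n elements).
mean_vector = pixels.mean(axis=0)
print(mean_vector)
```
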
  11. Image Space
     - Image space (col, row): an array of elements corresponding to reflected or emitted energy from the IFOV
     - the spatial arrangement of the measurements of the reflected or emitted energy
     (Figures: single-band image and multi-band image)
  12. Feature Space
     - A feature space image is simply a graph of the data file values of one band of data against the values of another band.
     (Figure: analyzing patterns in multispectral data; pixels A (34, 25), B (34, 24), and C (11, 77) plotted in feature space)
  13. One-dimensional feature space
     (Figure: input layer; distinction between classes versus no distinction between classes)
  14. Feature Space
     (Figure: multi-dimensional feature vectors)
  15. Feature space (scattergram)
     - A two- or three-dimensional graph or scatter diagram showing the formation of clusters of points representing DN values in two or three spectral bands.
     - Each cluster of points corresponds to a certain cover type on the ground.
     (Figure: low-frequency and high-frequency regions of the scattergram)
  16. Distances and clusters in feature space
     (Figure: Euclidean distance; cluster)
  17. Spectral Distance
     - Euclidean spectral distance is distance in n-dimensional spectral space. It is a number that allows two measurement vectors to be compared for similarity. The spectral distance between two pixels d and e can be calculated as:

       D = sqrt( sum_{i=1}^{n} (d_i - e_i)^2 )

       where D is the spectral distance, n is the number of bands (dimensions), i is a particular band, d_i is the data file value of pixel d in band i, and e_i is the data file value of pixel e in band i.
     - This is the equation for Euclidean distance. In two dimensions (n = 2), it reduces to the Pythagorean theorem (c^2 = a^2 + b^2), in this case: D^2 = (d_i - e_i)^2 + (d_j - e_j)^2.
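The spectral distance formula above is a one-liner in numpy; a minimal sketch:

```python
import numpy as np

def spectral_distance(d, e):
    """Euclidean spectral distance between two pixel measurement vectors
    d and e, each holding one data file value per band."""
    d = np.asarray(d, dtype=float)
    e = np.asarray(e, dtype=float)
    return float(np.sqrt(np.sum((d - e) ** 2)))

# In two dimensions this reduces to the Pythagorean theorem (3-4-5 triangle).
print(spectral_distance([0, 0], [3, 4]))  # 5.0
```
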
  18. Image classification process
     (Figure: a five-step process, including selection of the image data, definition of the clusters in the feature space, and validation of the result)
  19. Information classes versus spectral classes
     - It is important for the analyst to realize that there is a fundamental difference between information classes and spectral classes.
     - Information classes are those that human beings define.
     - Spectral classes are those that are inherent in the remote sensor data and must be identified and then labeled by the analyst.
  20. SUPERVISED CLASSIFICATION
     - The identity and location of some of the land cover types, such as urban, agriculture, and wetland, are known a priori through a combination of field work and experience.
     - The analyst attempts to locate specific sites in the remotely sensed data that represent homogeneous examples of these known land cover types, known as training sites.
     - Multivariate statistical parameters are calculated for these training sites.
     - Every pixel both inside and outside the training sites is then evaluated and assigned to the class of which it has the highest likelihood of being a member.
  21. Supervised image classification
     - Steps in supervised classification:
       - identification of sample areas (training areas)
       - partitioning of the feature space
     - A class sample:
       - is a number of training pixels
       - forms a cluster in feature space
     - A cluster:
       - is the representative for a class
       - includes a minimum number of observations (30n)
       - is distinct
  22. UNSUPERVISED CLASSIFICATION
     - The identities of the land cover types to be specified as classes within a scene are generally not known a priori, because ground reference information is lacking or surface features within the scene are not well defined.
     - The computer is required to group pixels with similar spectral characteristics into unique clusters according to some statistically determined criteria.
     - The analyst then combines and relabels the spectral clusters into information classes.
  23. Supervised vs. Unsupervised Training
     - In supervised training, it is important to have a set of desired classes in mind, and then create the appropriate signatures from the data.
     - Supervised classification is usually appropriate when you want to identify relatively few classes, when you have selected training sites that can be verified with ground truth data, or when you can identify distinct, homogeneous regions that represent each class.
     - On the other hand, if you want the classes to be determined by spectral distinctions that are inherent in the data, so that you can define the classes later, then the application is better suited to unsupervised training. Unsupervised training enables you to define many classes easily and to identify classes that are not in contiguous, easily recognized regions.
  24. SUPERVISED CLASSIFICATION
     - In supervised training, you rely on your own pattern recognition skills and a priori knowledge of the data to help the system determine the statistical criteria (signatures) for data classification.
     - To select reliable samples, you should know some information, either spatial or spectral, about the pixels that you want to classify.
  25. Partition of a feature space
     - decide on decision boundaries
     - assign a class to each pixel
     (Figure: feature space partitioned into classes a, b, c, and d)
  26. Training Samples and Feature Space Objects
     - Training samples (also called samples) are sets of pixels that represent what is recognized as a discernible pattern, or potential class. The system calculates statistics from the sample pixels to create a parametric signature for the class.
  27. Selecting Training Samples
     - Training data for a class should be collected from a homogeneous environment.
     - Each site is usually composed of many pixels. The general rule is that if training data are being collected from n bands, then more than 10n pixels of training data should be collected for each class. This is sufficient to compute the variance-covariance matrices required by some classification algorithms.
  28. There are a number of ways to collect training site data:
     - using a vector layer
     - defining a polygon in the image
     - using a class from a thematic raster layer from an image file of the same area (e.g., the result of an unsupervised classification)
  29. Evaluating Signatures
     - There are tests that can help determine whether the signature data are a true representation of the pixels to be classified for each class. You can evaluate signatures that were created from either supervised or unsupervised training.
  30. Evaluation of Signatures
     - Ellipse: view ellipse diagrams and scatterplots of data file values for every pair of bands.
  31. Evaluation of Signatures (continued)
     - Signature separability is a statistical measure of the distance between two signatures. Separability can be calculated for any combination of bands used in the classification, enabling you to rule out any bands that are not useful in the results of the classification.
     - 1. Euclidean distance:

       D = sqrt( sum_{i=1}^{n} (d_i - e_i)^2 )

       where D is the spectral distance, n is the number of bands (dimensions), i is a particular band, d_i is the data file value of pixel d in band i, and e_i is the data file value of pixel e in band i.
  32. Signature Separability (continued)
     - 2. Divergence (equation shown as a figure in the original slide).
  33. Signature Separability (continued)
     - 3. Transformed divergence. The scale of the divergence values can range from 0 to 2,000. As a general rule, if the result is greater than 1,900, the classes can be separated; between 1,700 and 1,900, the separation is fairly good; below 1,700, the separation is poor (Jensen, 1996).
  34. Signature Separability (continued)
     - 4. Jeffries-Matusita distance. The range of JM is between 0 and 1,414. The JM distance has a saturating behavior with increasing class separation, like transformed divergence; however, it is not as computationally efficient as transformed divergence (Jensen, 1996).
  35. SELECTING AN APPROPRIATE CLASSIFICATION ALGORITHM
     - Various supervised classification algorithms may be used to assign an unknown pixel to one of the classes.
     - The choice of a particular classifier depends on the nature of the input data and the output required.
     - Parametric classification algorithms assume that the observed measurement vectors X_c, obtained for each class in each spectral band during the training phase, are Gaussian in nature.
     - Non-parametric classification algorithms make no such assumption.
     - There are many classification algorithms, e.g. parallelepiped, minimum distance, and maximum likelihood.
  36. PARALLELEPIPED CLASSIFICATION ALGORITHM
     - In the parallelepiped decision rule, the data file values of the candidate pixel are compared to upper and lower limits. These limits can be:
       - the minimum and maximum data file values of each band in the signature,
       - the mean of each band plus and minus a number of standard deviations, or
       - any limits that you specify, based on your knowledge of the data and signatures.
     - There are high and low limits for every signature in every band. When a pixel's data file values are between the limits for every band in a signature, the pixel is assigned to that signature's class.
  37. Therefore, if the low and high decision boundaries are defined as

       L_ck = μ_ck - S_ck   and   H_ck = μ_ck + S_ck

      the parallelepiped algorithm becomes

       L_ck ≤ BV_ijk ≤ H_ck

      where μ_ck is the mean and S_ck the standard deviation of class c in band k, and BV_ijk is the brightness value of pixel (i, j) in band k.
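The boundary test above can be sketched as follows. This is an illustrative implementation, not vendor code; the class means and standard deviations are made up, and overlap is resolved by signature order, one of the options discussed later.

```python
import numpy as np

def parallelepiped_classify(pixel, means, stds, k=1.0):
    """Assign `pixel` (one value per band) to the first class whose
    mean +/- k standard deviations bounds it in every band; return None
    (unclassified) if no class matches. `means` and `stds` have shape
    (n_classes, n_bands)."""
    pixel = np.asarray(pixel, dtype=float)
    low, high = means - k * stds, means + k * stds
    inside = np.all((low <= pixel) & (pixel <= high), axis=1)
    hits = np.flatnonzero(inside)
    return int(hits[0]) if hits.size else None  # signature order breaks ties

# Hypothetical two-band signatures.
means = np.array([[40.0, 40.0], [10.0, 40.0]])
stds = np.array([[5.0, 5.0], [5.0, 5.0]])
print(parallelepiped_classify([42, 38], means, stds))  # inside class 0's box
print(parallelepiped_classify([25, 25], means, stds))  # outside both: None
```
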
  38. Feature Space Partitioning: Box Classifier
     (Figure: means and standard deviations in bands 1 and 2, and the resulting partitioned feature space)
  39. (Figure: partitioned feature space showing the class "unknown")
  40. Points a and b are pixels in the image to be classified. Pixel a has a brightness value of 40 in band 4 and 40 in band 5; pixel b has a brightness value of 10 in band 4 and 40 in band 5. The boxes represent the parallelepiped decision rule associated with a ±1σ classification. The vectors (arrows) represent the distance from a and b to the mean of all classes in a minimum-distance-to-means classification algorithm.
  41. Overlap Region
     - In cases where a pixel may fall into the overlap region of two or more parallelepipeds, you must define how the pixel is to be classified:
       - The pixel can be classified by the order of the signatures.
       - The pixel can be classified by the defined parametric decision rule.
       - The pixel can be left unclassified.
  42. Parallelepiped: advantages and disadvantages
     - ADVANTAGES:
       - Fast and simple.
       - Gives a broad classification, thus narrowing down the number of possible classes to which each pixel can be assigned before more time-consuming calculations are made.
       - Not dependent on normal distributions.
     - DISADVANTAGES:
       - Since a parallelepiped has corners, pixels that are actually quite far, spectrally, from the mean of the signature may be classified.
     (Figure: parallelepiped corners compared to the signature ellipse)
  43. MINIMUM DISTANCE TO MEANS CLASSIFICATION ALGORITHM
     - This decision rule is computationally simple and commonly used.
     - Requires the mean vectors μ_ck for each class c in each band k from the training data.
     - The Euclidean distance is calculated from each pixel to each of the signature means:

       D = sqrt( (BV_ijk - μ_ck)^2 + (BV_ijl - μ_cl)^2 )

       where μ_ck and μ_cl represent the mean vectors for class c measured in bands k and l.
     - Every unknown pixel is assigned to one of the classes; there are no unclassified pixels.
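The minimum-distance rule can be sketched in a few lines. The class means below are hypothetical; the two test pixels echo the lecture's band 4 / band 5 example coordinates for pixels a and b.

```python
import numpy as np

def min_distance_classify(pixel, class_means):
    """Assign `pixel` to the class with the smallest Euclidean distance
    to its mean vector; every pixel gets a class, none is unclassified."""
    pixel = np.asarray(pixel, dtype=float)
    dists = np.sqrt(np.sum((class_means - pixel) ** 2, axis=1))
    return int(np.argmin(dists))

# Hypothetical two-band class means (band 4, band 5).
class_means = np.array([[40.0, 40.0], [10.0, 40.0], [25.0, 10.0]])
print(min_distance_classify([40, 40], class_means))  # nearest mean: class 0
print(min_distance_classify([10, 40], class_means))  # nearest mean: class 1
```
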
  44. MINIMUM DISTANCE TO MEANS
     - Decision rule: priority to the shortest distance to the class mean.
     (Figure: histogram of the training set)
  45. Feature Space Partitioning: Minimum Distance to Mean Classifier
     (Figure: mean vectors, the "unknown" class, and the threshold distance in the partitioned feature space)
  46. Minimum distance: advantages and disadvantages
     - ADVANTAGES:
       - Since every pixel is spectrally closer to one sample mean or another, there are no unclassified pixels.
       - Fastest after the parallelepiped decision rule.
     - DISADVANTAGES:
       - Pixels that should be unclassified become classified.
       - Does not consider class variability.
  47. Mahalanobis Decision Rule
     - Mahalanobis distance is similar to minimum distance, except that the covariance matrix is used in the equation. Variance and covariance are figured in, so that clusters that are highly varied lead to similarly varied classes.
  48. Mahalanobis: advantages and disadvantages
     - Advantages:
       - Takes the variability of classes into account, unlike minimum distance or parallelepiped.
       - May be more useful than minimum distance in cases where statistical criteria (as expressed in the covariance matrix) must be taken into account.
     - Disadvantages:
       - Tends to overclassify signatures with relatively large values in the covariance matrix.
       - Slower to compute than parallelepiped or minimum distance.
       - Mahalanobis distance is parametric, meaning that it relies heavily on a normal distribution of the data in each input band.
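A small sketch shows how the covariance matrix changes the distance: with a hypothetical covariance where band 1 varies much more than band 2, the same 10-unit offset from the mean counts for far less along the high-variance band.

```python
import numpy as np

def mahalanobis_sq(pixel, mean, cov):
    """Squared Mahalanobis distance of `pixel` from a class with mean
    vector `mean` and covariance matrix `cov`; unlike plain Euclidean
    distance, class variability is taken into account."""
    diff = np.asarray(pixel, dtype=float) - mean
    return float(diff @ np.linalg.inv(cov) @ diff)

mean = np.array([40.0, 40.0])
# Hypothetical covariance: band 1 variance 100, band 2 variance 4.
cov = np.array([[100.0, 0.0],
                [0.0, 4.0]])
print(mahalanobis_sq([50, 40], mean, cov))  # offset along high-variance band
print(mahalanobis_sq([40, 50], mean, cov))  # same offset, low-variance band
```
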
  49. Maximum Likelihood/Bayesian Decision Rule
     - The maximum likelihood decision rule is based on the probability that a pixel belongs to a particular class. The basic equation assumes that these probabilities are equal for all classes and that the input bands have normal distributions.
     - If you have a priori knowledge that the probabilities are not equal for all classes, you can specify weight factors for particular classes. This variation of the maximum likelihood decision rule is known as the Bayesian decision rule (Hord, 1982).
  50. The equation for the maximum likelihood/Bayesian classifier (shown as a figure in the original slide) computes a discriminant D for each class; the pixel is assigned to the class c for which D is the lowest.
  51. Maximum likelihood: advantages and disadvantages
     - Advantages:
       - The most accurate of the classifiers (if the input samples/clusters have a normal distribution), because it takes the most variables into consideration.
       - Takes the variability of classes into account by using the covariance matrix, as does Mahalanobis distance.
     - Disadvantages:
       - An extensive equation that takes a long time to compute; the computation time increases with the number of input bands.
       - Maximum likelihood is parametric, meaning that it relies heavily on a normal distribution of the data in each input band.
       - Tends to overclassify signatures with relatively large values in the covariance matrix.
  52. UNSUPERVISED CLASSIFICATION
     - Requires only a minimum amount of initial input from the analyst.
     - Numerical operations are performed that search for natural groupings of the spectral properties of pixels.
     - The user allows the computer to select the class means and covariance matrices to be used in the classification.
     - Once the data are classified, the analyst attempts, a posteriori, to assign these natural or spectral classes to the information classes of interest.
     - Some clusters may be meaningless because they represent mixed classes.
     - The clustering algorithms used for unsupervised classification generally vary according to the efficiency with which the clustering takes place.
     - Two commonly used methods are:
       - the chain method
       - ISODATA clustering
  53. CHAIN METHOD
     - Operates in two-pass mode (it passes through the registered multispectral dataset two times).
     - In the first pass, the program reads through the dataset and sequentially builds clusters.
     - A mean vector is associated with each cluster.
     - In the second pass, a minimum-distance-to-means classification algorithm is applied to the whole dataset on a pixel-by-pixel basis, whereby each pixel is assigned to one of the mean vectors created in pass 1.
     - The first pass automatically creates the cluster signatures to be used by the supervised classifier.
  54. PASS 1: CLUSTER BUILDING
     - During the first pass, the analyst is required to supply four types of information:
       - R, the radius distance in spectral space used to determine when a new cluster should be formed;
       - C, a spectral space distance parameter used when merging clusters once N is reached;
       - N, the number of pixels to be evaluated between each major merging of clusters;
       - Cmax, the maximum number of clusters to be identified.
     - PASS 2: assignment of pixels to one of the Cmax clusters using minimum distance classification logic.
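Pass 1 can be sketched as a sequential scan: start a new cluster when a pixel is farther than R from every existing cluster mean, otherwise merge it into the nearest cluster and update that cluster's running mean. This is an illustrative simplification; the periodic C/N merging step between batches is omitted. The two pixels reproduce the lecture's example, where (10, 10) and (20, 20) merge into one cluster whose mean migrates to (15, 15).

```python
import numpy as np

def chain_pass1(pixels, R, Cmax):
    """Simplified pass 1 of the chain method: sequentially build cluster
    mean vectors, merging each pixel into the nearest cluster within
    radius R or starting a new cluster (up to Cmax)."""
    means, counts = [], []
    for p in np.asarray(pixels, dtype=float):
        if means:
            d = [np.linalg.norm(p - m) for m in means]
            i = int(np.argmin(d))
            if d[i] <= R:
                counts[i] += 1
                # Weighted update: the mean shifts less as the cluster grows.
                means[i] += (p - means[i]) / counts[i]
                continue
        if len(means) < Cmax:
            means.append(p.copy())
            counts.append(1)
    return means

# Lecture example: pixel 1 at (10, 10), pixel 2 at (20, 20), R = 15.
means = chain_pass1([[10, 10], [20, 20]], R=15, Cmax=20)
print(means)  # one merged cluster with mean (15, 15)
```
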
  55. (Figure: original brightness values of pixels 1, 2, and 3 as measured in bands 4 and 5 of the hypothetical remotely sensed data)
  56. The distance D in two-dimensional spectral space between pixel 1 (cluster 1) and pixel 2 (potential cluster 2) in the first iteration is computed and tested against the value of R = 15, the minimum acceptable radius. In this case, D does not exceed R; therefore, clusters 1 and 2 are merged, as shown in the next illustration.
  57. Pixels 1 and 2 now represent cluster #1. Note that the location of cluster 1 has migrated from (10, 10) to (15, 15) after the first iteration. Next, the distance to pixel 3 (D = 15.81) is computed to see if it is greater than the minimum threshold, R = 15. It is, so pixel location 3 becomes cluster #2. This process continues until all 20 clusters are identified. Then the 20 clusters are evaluated using a distance measure, C (not shown), to merge the clusters that are closest to one another.
  58. (Figure: how clusters migrate during the several iterations of a clustering algorithm. The final ending point represents the mean vector that would be used in phase 2 of the clustering process, when the minimum distance classification is performed.)
  59. Seeding the clustering process
     - Some clustering algorithms allow the analyst to initially seed the mean vectors for several of the important classes. The seed data are usually obtained in a supervised fashion, as discussed previously. Others allow the analyst to use a priori information to direct the clustering process.
     - Note: as more points are added to a cluster, the mean shifts less dramatically, since the newly computed mean is weighted by the number of pixels currently in the cluster. The ending point is the spectral location of the final mean vector that is used as a signature in the minimum distance classifier applied in pass 2.
  60. Pass 2: Assignment of Pixels to One of the Cmax Clusters Using Minimum Distance Classification Logic
     - The final cluster mean vectors are used in a minimum-distance-to-means classification algorithm to classify all the pixels in the image into one of the Cmax clusters.
  61. ISODATA Clustering
     - The Iterative Self-Organizing Data Analysis Technique (ISODATA) represents a comprehensive set of heuristic (rule-of-thumb) procedures that have been incorporated into an iterative classification algorithm.
     - The ISODATA algorithm is a modification of the k-means clustering algorithm, which includes (a) merging clusters if their separation distance in multispectral feature space is below a user-specified threshold and (b) rules for splitting a single cluster into two clusters.
     - ISODATA is iterative because it makes a large number of passes through the remote sensing dataset until specified results are obtained, instead of just two passes.
     - ISODATA does not allocate its initial mean vectors based on the analysis of pixels; rather, an initial arbitrary assignment of all Cmax clusters takes place along an n-dimensional vector that runs between very specific points in feature space.
  62. ISODATA parameters
     - The ISODATA algorithm normally requires the analyst to specify:
       - Cmax, the maximum number of clusters to be identified;
       - T, the maximum percentage of pixels whose class values are allowed to be unchanged between iterations;
       - M, the maximum number of times ISODATA is to classify pixels and recalculate cluster mean vectors;
       - the minimum number of members in a cluster;
       - the maximum standard deviation for a cluster;
       - the split separation value (if the value is changed from 0.0, it takes the place of the standard deviation);
       - the minimum distance between cluster means.
  63. Phase 1: ISODATA Cluster Building Using Many Passes Through the Dataset
     - ISODATA initially distributes, say, five hypothetical mean vectors using ±1σ standard deviations in both bands as beginning and ending points.
     - In the first iteration, each candidate pixel is compared to each cluster mean and assigned to the cluster whose mean is closest in Euclidean distance.
     - During the second iteration, a new mean is calculated for each cluster based on the actual spectral locations of the pixels assigned to each cluster, instead of the initial arbitrary calculation. This involves analysis of several parameters to merge or split clusters. After the new cluster mean vectors are selected, every pixel in the scene is assigned to one of the new clusters.
     - This split-merge-assign process continues until there is little change in class assignment between iterations (the T threshold is reached) or the maximum number of iterations (M) is reached.
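The assign/update/merge loop above can be sketched as follows. This is a deliberately simplified ISODATA-style loop, not the full algorithm: the splitting rule (maximum standard deviation) and the minimum-members check are omitted, and the data and initial means are hypothetical.

```python
import numpy as np

def isodata_sketch(pixels, init_means, T=0.05, M=20, min_dist=5.0):
    """Simplified ISODATA-style loop: repeat assign/update passes until
    fewer than a fraction T of the pixels change cluster or M iterations
    elapse; clusters whose means come closer than `min_dist` are merged."""
    X = np.asarray(pixels, dtype=float)
    means = [np.asarray(m, dtype=float) for m in init_means]
    labels = np.full(len(X), -1)
    for _ in range(M):
        # Assign each pixel to the closest mean (Euclidean distance).
        d = np.array([[np.linalg.norm(x - m) for m in means] for x in X])
        new_labels = d.argmin(axis=1)
        changed = np.mean(new_labels != labels)
        labels = new_labels
        # Recompute each non-empty cluster's mean from its members.
        means = [X[labels == k].mean(axis=0)
                 for k in range(len(means)) if np.any(labels == k)]
        # Merge the first pair of means closer than min_dist, if any.
        for i in range(len(means)):
            for j in range(i + 1, len(means)):
                if np.linalg.norm(means[i] - means[j]) < min_dist:
                    means[i] = (means[i] + means[j]) / 2
                    del means[j]
                    break
            else:
                continue
            break
        if changed <= T:  # the T threshold: little change between passes
            break
    return means

X = [[10, 10], [12, 11], [50, 52], [49, 50]]
final = isodata_sketch(X, init_means=[[0, 0], [60, 60]])
print(final)
```
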
  64. (Figures: distribution of 20 ISODATA mean vectors after just one iteration, and after 20 iterations. The bulk of the important feature space (the gray background) is partitioned rather well after just 20 iterations.)
  65. Sources of Uncertainty in Image Classification
     1. Non-representative training areas
     2. High variability in the spectral signatures for a land cover class
     3. Mixed land cover within the pixel area
  66. Evaluating Classification
     - After a classification is performed, these methods are available for testing its accuracy:
       - Thresholding: use a probability image file to screen out misclassified pixels.
       - Accuracy assessment: compare the classification to ground truth or other data.
  67. Accuracy Assessment
     - Accuracy assessment is a general term for comparing the classification to geographical data that are assumed to be true, in order to determine the accuracy of the classification process. Usually, the assumed-true data are derived from ground truth data.
  68. Accuracy Assessment (continued)
     - Assessing the accuracy of a remote sensing output is one of the most important steps in any classification exercise.
     - Without an accuracy assessment, the output or results are of little value.
  69. There are a number of issues relevant to the generation and assessment of errors in a classification. These include:
     - the nature of the classification;
     - sample design; and
     - assessment sample size.
  70. Nature of Classification
     1. Class definition problems occur when trying to extract information from an image, such as tree height, that is unrealistic. If this happens, the error rate will increase.
     2. A common problem in classifying remotely sensed data is the use of inappropriate class labels, such as cliff, lake, or river, all of which are landforms and not cover types. Similarly, a common error is the use of class labels that define land uses; these features are commonly made up of several cover classes.
     3. The final point, in terms of the potential for generation of error, is the mislabeling of classes. The most obvious example is labeling a training site water when in fact it is something else. This will result in, at best, a skewing of your class statistics if your training site samples are sufficiently large, or, at worst, a shifting of the training statistics entirely if your sites are relatively small.
  71. Sample Design
     - In addition to being independent of the original training sample, the sample used must be of a design that will ensure consistency and objectivity.
     - A number of sampling techniques can be used, including random, systematic, and stratified random sampling.
     - Of the three, the systematic sample is the least useful. This approach may result in a sample distribution that favours a particular class, depending on the distribution of the classes within the map.
     - Only random sample designs can guarantee an unbiased sample.
     - The truly random strategy, however, may not yield a sample design that covers the entire map area, and so may be less than ideal.
     - In many instances, the stratified random sampling strategy is the most useful tool. In this case, the map area is stratified based either on a systematic breakdown followed by a random sample design in each of the systematic subareas, or, alternatively, on the application of a random sample within each of the map classes. This approach ensures adequate coverage of the entire map as well as a sufficient number of samples for each of the classes on the map.
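The per-class variant of stratified random sampling is easy to sketch: draw a fixed number of pixel locations at random within each map class, so every class is covered regardless of how rare it is. The tiny classified map below is hypothetical.

```python
import numpy as np

def stratified_random_sample(class_map, n_per_class, rng=None):
    """Stratified random sampling for accuracy assessment: pick
    n_per_class random pixel locations within each class of a 2-D map
    of class labels, returning {class: [(row, col), ...]}."""
    rng = np.random.default_rng(rng)
    samples = {}
    for cls in np.unique(class_map):
        rows, cols = np.nonzero(class_map == cls)
        n = min(n_per_class, rows.size)  # a stratum may be small
        idx = rng.choice(rows.size, size=n, replace=False)
        samples[int(cls)] = list(zip(rows[idx].tolist(), cols[idx].tolist()))
    return samples

# A tiny hypothetical classified map with classes 0, 1, and 2.
class_map = np.array([[0, 0, 1, 1],
                      [0, 2, 2, 1],
                      [0, 2, 1, 1]])
samples = stratified_random_sample(class_map, n_per_class=2, rng=42)
print({c: len(pts) for c, pts in samples.items()})
```
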
  72. Sample Size
     - The sample used must be sufficiently large to be statistically representative of the map area. The number of points considered necessary varies, depending on the method used to estimate it.
     - When using a systematic or random sample design, the number of points is kept to a manageable number. Because the number of points contained within a stratified area is usually high (greater than 10,000), the number of samples used to test the accuracy of the classes through a stratified random sample will be high as well; the cost of using a highly accurate sampling strategy is a large number of samples.
  73. ERROR MATRIX
     - Once a classification has been sampled, a contingency table (also referred to as an error matrix or confusion matrix) is developed.
     - This table is used to properly analyze the validity of each class as well as of the classification as a whole.
     - In this way we can evaluate the efficacy of the classification in more detail.
  74. One way to assess accuracy is to go out in the field, observe the actual land class at a sample of locations, and compare it to the land classification assigned on the thematic map.
     - There are a number of ways to quantitatively express the amount of agreement between the ground truth classes and the remote sensing classes.
     - One way is to construct a confusion matrix, alternatively called an error matrix.
     - This is a row-by-column table, with as many rows as columns.
     - Each row of the table is reserved for one of the information, or remote sensing, classes used by the classification algorithm.
     - Each column displays the corresponding ground truth classes in an identical order.
  75. OVERALL ACCURACY
     - The diagonal elements tally the number of pixels classified correctly in each class; overall accuracy is their sum divided by the total number of pixels.
     - But just because 83% of classifications were accurate overall does not mean that each category was successfully classified at that rate.
  76. USER'S ACCURACY
     - A user of the imagery who is particularly interested in class A, say, might wish to know what proportion of pixels assigned to class A were correctly assigned.
     - In this example, 35 of the 39 pixels assigned to class A were correctly assigned, so the user's accuracy for this category is 35/39 = 90%.
  77. In general terms, the user's accuracy for a particular category is computed as the number of correctly classified pixels in that category divided by the total number of pixels assigned to that category; for an error matrix set up with the row and column assignments as stated, this is the diagonal element divided by the row total. Evidently, a user's accuracy can be computed for each row.
  78. PRODUCER'S ACCURACY
     - Contrasted with user's accuracy is producer's accuracy, which has a slightly different interpretation.
     - Producer's accuracy is a measure of how much of the land in each category was classified correctly.
     - It is found, for each class or category, as the diagonal element divided by the column (ground truth) total.
     - The producer's accuracy for class A is 35/50 = 70%.
  79. So from this assessment we have three measures of accuracy which address subtly different issues:
     - overall accuracy: takes no account of the source of error (errors of omission or commission);
     - user's accuracy: measures the proportion of each thematic map class which is correct;
     - producer's accuracy: measures the proportion of the land base which is correctly classified.
  80. KAPPA COEFFICIENT
     - Another measure of map accuracy is the kappa coefficient, which is a measure of the proportional (or percentage) improvement by the classifier over a purely random assignment to classes.
     - For an error matrix with r rows, and hence the same number of columns, let:
       - A = the sum of the r diagonal elements, which is the numerator in the computation of overall accuracy;
       - B = the sum of the r products (row total x column total).
     - Then

       kappa = (N x A - B) / (N^2 - B)

       where N is the number of pixels in the error matrix (the sum of all individual cell values).
  81. For the error matrix above:
     - A = 35 + 37 + 41 = 113
     - B = (39 x 50) + (50 x 40) + (47 x 46) = 6,112
     - N = 136
     - Thus kappa = (136 x 113 - 6,112) / (136^2 - 6,112) = 9,256 / 12,384 = 0.75 (approximately).
     - This can be tested statistically.
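All four measures can be computed in a few lines. The matrix below reproduces the lecture's diagonal (35, 37, 41), row totals (39, 50, 47), and column totals (50, 40, 46); the individual off-diagonal counts are assumed for illustration, since the slides do not show them.

```python
import numpy as np

# Error matrix: rows = map (remote sensing) classes, columns = ground
# truth. Diagonal and row/column totals match the lecture's example;
# the off-diagonal cells are hypothetical but consistent with it.
cm = np.array([[35, 1, 3],
               [11, 37, 2],
               [4, 2, 41]])

N = cm.sum()                      # total pixels sampled
A = np.trace(cm)                  # sum of the diagonal (correct pixels)
overall = A / N                   # overall accuracy
users = np.diag(cm) / cm.sum(axis=1)      # user's accuracy, per row
producers = np.diag(cm) / cm.sum(axis=0)  # producer's accuracy, per column
B = (cm.sum(axis=1) * cm.sum(axis=0)).sum()  # sum of row x column totals
kappa = (N * A - B) / (N**2 - B)

print(round(overall, 2), round(users[0], 2),
      round(producers[0], 2), round(kappa, 2))
```

Running this reproduces the slides' figures: overall accuracy about 83%, user's accuracy 35/39 for class A, producer's accuracy 35/50, and kappa about 0.75.
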
  82. Thank You