CHAPTER FIVE
IMAGE CLASSIFICATION AND ANALYSIS
• A human analyst attempting to classify features in an image
uses the elements of visual interpretation to identify
homogeneous groups of pixels which represent various features
or land cover classes of interest.
• Spectral pattern recognition - uses the spectral information
represented by the digital numbers in one or more spectral
bands, and attempts to classify each individual pixel based on
this spectral information.
CONT…
Information classes – are those categories of interest that the analyst is
actually trying to identify in the imagery, such as different kinds of
crops, different forest types or tree species, different geologic units or
rock types, etc.
Spectral classes - are groups of pixels that are uniform (or near-similar)
with respect to their brightness values in the different spectral channels
of the data.
 The objective is to match the spectral classes in the data to the
information classes of interest.
Common classification procedures can be
broken down into three broad subdivisions based
on the method used:
I. SUPERVISED CLASSIFICATION
II. UNSUPERVISED CLASSIFICATION
III. HYBRID CLASSIFICATION (a combination of the two)
CONT…
SUPERVISED CLASSIFICATION
 In a supervised classification, the analyst identifies homogeneous representative
samples (referred to as training areas) of the different surface cover types (information
classes) of interest in the imagery.
 The selection of appropriate training areas is based on
 the analyst’s familiarity with the geographical area and
 knowledge of the actual surface cover types present in the image.
 Thus, the analyst is supervising the categorization of a set of specific classes.
 The numerical information in all spectral bands for the pixels comprising these areas
is used to train the computer to recognize spectrally similar areas for each class.
 The computer uses special programs or algorithms to determine the numerical
signatures for each training class.
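As a minimal illustration of how numerical class signatures can be derived from training pixels (a sketch only; the array layout, function name, and class labels are assumptions, not the source's software), each signature can be stored as the per-band mean vector and covariance matrix of the pixels in a training area:

```python
import numpy as np

def class_signatures(training_pixels, labels):
    """Compute per-class spectral signatures from training data.

    training_pixels: (n_pixels, n_bands) digital numbers of all training pixels.
    labels: (n_pixels,) information-class label of each training pixel.
    Returns {class: (mean_vector, covariance_matrix)}."""
    signatures = {}
    for c in np.unique(labels):
        samples = training_pixels[labels == c]
        signatures[c] = (samples.mean(axis=0), np.cov(samples, rowvar=False))
    return signatures

# Example: 4-band pixels from two training classes (synthetic values)
pixels = np.array([[50, 60, 55, 70], [52, 61, 57, 72],
                   [120, 90, 80, 40], [118, 92, 83, 42]], dtype=float)
labels = np.array(["forest", "forest", "urban", "urban"])
print(class_signatures(pixels, labels)["urban"][0])   # mean vector of the 'urban' class
```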
Once the computer has determined the signatures for each class,
each pixel in the image is compared to these signatures and labeled as
the class it most closely resembles digitally.
Thus, in a supervised classification, the analyst first identifies the
information classes and then determines the spectral classes that
represent them.
In order to carry out a supervised classification, the analyst should
adopt a well-defined procedure so as to achieve a satisfactory
classification.
CONT…
STEPS REQUIRED FOR SUPERVISED CLASSIFICATION
(a) Firstly,
 acquire satellite data and accompanying metadata.
 Look for information regarding platform, projection, resolution, coverage,
and, importantly, meteorological conditions before and during data
acquisition.
(b) Secondly,
 Choose the surface types to be mapped. Collect ground truth data with
positional accuracy (e.g., GPS).
 These data are used to develop the training classes for the discriminant
analysis.
 Ideally, it is best to time the ground truth data collection to coincide with
the satellite passing overhead.
CONT…
(c) Thirdly,
 begin the classification by performing image pre-
processing techniques (corrections, image mosaics, and
enhancements).
 Select pixels in the image that are representative (and
homogeneous) of the object.
 If GPS field data were collected, geo-register the GPS field
plots onto the imagery and define the image training sites by
outlining the GPS polygons. A training class consists of the selected
points (pixels) or polygons (clusters of pixels).
The success of a supervised classification depends upon the training data used to
identify different classes.
Hence, the selection of training data has to be done meticulously, keeping in mind
that each training data set has some specific characteristics.
These characteristics are discussed below:
Number of pixels: An important consideration is the number of pixels to be
selected for each information class.
There is no single guideline, but in general the analyst must ensure that a
sufficient number of pixels is selected. Commonly cited rules of thumb include:
A single training area covering a minimum mapping unit (MMU) of about 4 hectares;
At least 100 pixels per class, which should also govern the MMU;
Between 10×(number of bands) and 100×(number of bands) pixels per class;
Roughly 1% of the total image as the combined sample over all classes.
A simple helper for these rules of thumb is sketched below.
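As an illustration only (the thresholds are the rules of thumb listed above, not fixed standards; the function name and inputs are hypothetical), a small Python helper can report whether a training set meets them:

```python
def training_pixel_guidelines(n_training_pixels, n_bands, image_pixels):
    """Check a training set against common rules of thumb (not hard rules)."""
    min_per_class = 10 * n_bands      # lower guide: 10 x number of bands
    max_per_class = 100 * n_bands     # generous upper guide: 100 x number of bands
    return {
        "at_least_100_pixels": n_training_pixels >= 100,
        "within_10x_to_100x_bands": min_per_class <= n_training_pixels <= max_per_class,
        "share_of_image_percent": 100.0 * n_training_pixels / image_pixels,
    }

# Example: 250 training pixels, 6-band imagery, 1,000,000-pixel scene
print(training_pixel_guidelines(250, 6, 1_000_000))
```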
Size: The training sets identified on the image should be large enough to provide
accurate and reliable statistics for the informational class.
Rather than selecting a few large samples in a limited area of the image, it is
more reliable to select many small samples distributed uniformly
throughout the image.
CONT…
Shape: Shape is not a critical
characteristic.
However, training areas with
regular shapes are easier to
delineate and extract from the
satellite images.
CONT…
Placement: The training area should be placed in such a
way that it does not lie close to the edge (boundary)
of the information class.
Uniformity: This is one of the most critical and important
characteristics of training data for an information class.
The training data collected must exhibit uniformity or
homogeneity in the information.
If the histogram of the training pixels displays one peak, i.e., a unimodal
frequency distribution for each spectral class, the
training data are acceptable.
If the histogram is multimodal, there is
variability or mixing of information, and the training data must be
discarded (a quick histogram check is sketched below).
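As a hedged sketch of this unimodality check (not taken from the source; the band choice, bin count, and smoothing are illustrative assumptions), one can inspect the histogram of a training area in a single band and count its peaks:

```python
import numpy as np
from scipy.signal import find_peaks

def is_unimodal(training_pixels, bins=32, smooth=3):
    """Return True if the histogram of one band's training pixels has a single peak."""
    counts, _ = np.histogram(training_pixels, bins=bins)
    # Light smoothing so small histogram noise does not create spurious peaks
    kernel = np.ones(smooth) / smooth
    smoothed = np.convolve(counts, kernel, mode="same")
    peaks, _ = find_peaks(smoothed)
    return len(peaks) <= 1

# Example: digital numbers of one band inside a candidate training polygon
dns = np.random.normal(loc=85, scale=5, size=500)   # synthetic, roughly unimodal
print(is_unimodal(dns))
```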
CONT…
 Location: Informational classes generally exhibit some spectral variability across a
scene; thus training data should be located so that they account for the different
types of conditions within the image.
 It is desirable that the analyst undertakes a field visit to the desired location
to clearly mark out the selected information.
 In case of inaccessible or mountainous regions, aerial photographs or maps can
provide the basis for accurate delineation of training areas.
 Number of training areas: The number of training areas depends upon the number
of categories to be mapped, their diversity, and the resources available for
delineating training areas.
 In general, five to ten training samples per class are selected in order to
account for the spatial and spectral variability of the informational class.
 Selection of multiple training areas is also desirable as it may be possible that some
training areas of a class may have to be discarded later.
Some Classification Algorithms
1. Minimum Distance to Mean
 Minimum distance to the mean is a simple computation that classifies pixels
based on their distance from the mean of each training class.
 For each unassigned pixel, the brightness values are plotted in feature space and
the Euclidean distance (using the Pythagorean theorem) from the pixel to each class
mean is calculated, e.g. for two bands k and l:
$d_{c} = \sqrt{(BV_{ijk} - \mu_{ck})^{2} + (BV_{ijl} - \mu_{cl})^{2}}$
where $\mu_{ck}$ and $\mu_{cl}$ represent the mean values for class c measured in bands k and l,
and $BV_{ijk}$, $BV_{ijl}$ are the brightness values of pixel (i, j) in those bands.
 The unknown pixel is assigned to the closest class.
 Because every unknown pixel is always assigned to one of the classes, there will
be no unclassified pixels.
 Pixels are assigned to the training class to which they have the
minimum distance.
 The user may designate a threshold for the maximum acceptable
distance; pixels whose distance to the nearest class mean exceeds the
designated threshold are classified as unknown. A minimal sketch of this
classifier follows.
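The following is a minimal numpy sketch of a minimum-distance-to-mean classifier with an optional distance threshold; the array shapes, function name, and threshold value are illustrative assumptions, not part of the source.

```python
import numpy as np

def min_distance_classify(pixels, class_means, threshold=None):
    """Assign each pixel (n_pixels, n_bands) to the nearest class mean (n_classes, n_bands).

    Returns class indices; -1 marks pixels whose nearest distance exceeds the threshold."""
    # Euclidean distance from every pixel to every class mean
    diffs = pixels[:, None, :] - class_means[None, :, :]      # (n_pixels, n_classes, n_bands)
    dists = np.sqrt((diffs ** 2).sum(axis=2))                 # (n_pixels, n_classes)
    labels = dists.argmin(axis=1)
    if threshold is not None:
        labels = np.where(dists.min(axis=1) > threshold, -1, labels)
    return labels

# Example with two 2-band class means and three pixels (arbitrary values)
means = np.array([[40.0, 60.0], [120.0, 90.0]])
pix = np.array([[42.0, 58.0], [118.0, 92.0], [200.0, 10.0]])
print(min_distance_classify(pix, means, threshold=30.0))      # -> [ 0  1 -1]
```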
[Figure: training data, scatter plot, and the resulting minimum distance to mean classification]
ADVANTAGES
 Every pixel is spectrally closer to one sample mean than to the others, so
there are no unclassified pixels (unless a distance threshold is applied).
 Mathematically simple and computationally fast.
DISADVANTAGES
 Pixels that should remain unclassified may nevertheless be classified.
 Does not consider class variability, i.e., it is insensitive to the different degrees of class
variance in the spectral data.
2. Maximum Likelihood –
 Maximum Likelihood is computationally complex.
 It establishes the variance and covariance about the mean of the
training classes.
 This algorithm then statistically calculates the probability of an
unassigned pixel belonging to each class.
 The pixel is then assigned to the class for which it has the highest
probability.
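A hedged sketch of a Gaussian maximum likelihood classifier follows; it uses scipy's multivariate normal density, assumes equal prior probabilities, and the variable names and statistics are illustrative, not the source's implementation.

```python
import numpy as np
from scipy.stats import multivariate_normal

def max_likelihood_classify(pixels, class_means, class_covs):
    """Assign each pixel to the class with the highest Gaussian likelihood.

    pixels: (n_pixels, n_bands); class_means: list of mean vectors;
    class_covs: list of covariance matrices estimated from the training data."""
    likelihoods = np.column_stack([
        multivariate_normal(mean=m, cov=c).pdf(pixels)
        for m, c in zip(class_means, class_covs)
    ])                                   # (n_pixels, n_classes)
    return likelihoods.argmax(axis=1)

# Example with two 2-band classes (synthetic statistics)
means = [np.array([40.0, 60.0]), np.array([120.0, 90.0])]
covs  = [np.eye(2) * 25.0, np.eye(2) * 100.0]
pix = np.array([[45.0, 62.0], [115.0, 95.0]])
print(max_likelihood_classify(pix, means, covs))   # -> [0 1]
```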
Equal probability contours are plotted around the mean of each training class;
the probability of belonging to a class decreases with distance from its
mean point.
ADVANTAGES
 The most accurate of these classifiers (if the input samples have a normal distribution)
because it uses the most information: the class means, variances, and covariances.
 Takes the variability of classes into account.
DISADVANTAGES
 An extensive equation that takes a long time to compute.
 It is parametric (it assumes normally distributed data).
 Tends to overclassify signatures with relatively large values in the
covariance matrix.
UNSUPERVISED CLASSIFICATION
In essence, it is the reverse of the supervised
classification process.
Spectral classes are grouped first, based solely on
the numerical information in the data, and are then
matched by the analyst to information classes (if
possible).
Programs called clustering algorithms are used to
determine the natural groupings or structures in the
data.
Usually, the analyst specifies how many groups or
clusters are to be looked for in the data.
CONT…
Some clusters may need to be broken down further; each of
these changes requires a further iteration of the clustering
algorithm.
Thus, unsupervised classification is not completely
without human intervention.
However, it does not start with a pre-determined
set of classes as in a supervised classification.
Steps Required for Unsupervised Classification
The analyst typically specifies the following clustering parameters (a clustering sketch follows this list):
1) The number of classes (clusters),
2) The maximum number of iterations,
3) The maximum number of times a pixel can be moved from one
cluster to another within each iteration,
4) The minimum distance from the mean, and
5) The maximum standard deviation allowable.
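As an illustrative sketch only (the source does not name a specific algorithm; plain K-means is used here as a common stand-in, and the parameter values and function name are assumptions), a clustering run with a fixed number of classes and a maximum number of iterations might look like this:

```python
import numpy as np

def kmeans_cluster(pixels, n_classes=5, max_iter=20, seed=0):
    """Group pixels (n_pixels, n_bands) into spectral clusters with plain K-means."""
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), n_classes, replace=False)]
    for _ in range(max_iter):
        # Assign each pixel to its nearest cluster centre
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute cluster centres; stop when they no longer move
        new_centers = np.array([
            pixels[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(n_classes)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Example with synthetic 3-band pixels drawn from two spectral groups
pix = np.vstack([np.random.normal(40, 5, (200, 3)), np.random.normal(120, 5, (200, 3))])
labels, centers = kmeans_cluster(pix, n_classes=2)
print(centers.round(1))
```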
[Figure: comparison of unsupervised and supervised classification]
ADVANCED IMAGE PROCESSING
SOFT (SUB-PIXEL) CLASSIFICATION
 Mixed pixels are a characteristic feature of remote sensing datasets.
 Pixels containing more than one land-use/land-cover class therefore
need to be examined.
 The spectral reflectance of a mixed pixel is an average of the reflectances of
the land-cover classes within it.
 Many algorithms have been developed to classify mixed pixels containing
different classes; some of the most widely used are (a linear unmixing sketch follows this list):
 MLC in fuzzy form
 Linear Mixture Modelling
 Fuzzy C-Means Classifier
 Neural Networks Classifier
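A hedged sketch of linear mixture modelling (one of the algorithms listed above) follows; it solves for class fractions by non-negative least squares under the usual assumptions that the endmember (pure-class) spectra are known. The endmember values, function name, and band count are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import nnls

def linear_unmix(pixel_spectrum, endmembers):
    """Estimate class fractions for one mixed pixel.

    pixel_spectrum: (n_bands,); endmembers: (n_bands, n_classes) pure-class spectra.
    Uses non-negative least squares, then normalizes fractions to sum to 1."""
    fractions, _ = nnls(endmembers, pixel_spectrum)
    total = fractions.sum()
    return fractions / total if total > 0 else fractions

# Example: a 4-band pixel that is roughly 70% vegetation, 30% soil (synthetic spectra)
veg  = np.array([0.05, 0.08, 0.04, 0.45])
soil = np.array([0.15, 0.20, 0.25, 0.30])
mixed = 0.7 * veg + 0.3 * soil
print(linear_unmix(mixed, np.column_stack([veg, soil])).round(2))   # -> [0.7 0.3]
```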
Neural Networks Classifier
 A neural network is a massively parallel distributed processor made up of simple
processing units, which has a natural propensity for storing experiential knowledge and
making it available for use. It resembles the brain in two respects:
 Knowledge is acquired by the network from its environment through a learning process;
 Interneuron connection strengths, known as synaptic weights, are used to store the
acquired knowledge.
 Important characteristics of Artificial Neural Networks (ANN) are:
 Their non-parametric nature, which assumes no a priori knowledge, particularly of the
frequency distribution of the data;
 Their adaptability and their ability to produce classification accuracies
higher than those generated by statistical classifiers;
 Their capability to handle complex datasets, which has led to an increasing amount of research
in the remote sensing field (Paola and Schowengerdt 1995b, Atkinson and Tatnall
1997).
 The rapid uptake of neural approaches in remote sensing is due mainly to their widely demonstrated ability to:
 perform more accurately and rapidly than other techniques such as statistical classifiers, particularly
when the feature space is complex and the source data have nonlinear or differing statistical distributions;
 incorporate a priori knowledge and realistic physical constraints into the analysis (Brown and Harris 1994);
 incorporate different types of data (including those from different sensors) into the analysis, thus
facilitating synergistic studies (Benediktsson et al. 1993, Benediktsson and Sveinsson 1997);
 proceed with no underlying model assumed for the multivariate distribution of the data in
feature space, i.e., they are distribution-free;
 avoid requiring the data to follow a specific probability density function;
 adapt to non-linearity, so that the network can be conceived as a complex mathematical function that converts input data (e.g.,
land-use parameters such as slope, buffer distance, population density, etc.) to a desired output (e.g., a
growth prediction);
 achieve better accuracy by evaluating the match between training and testing data before prediction:
once the training meets the testing requirement with the required level of accuracy, the prediction
can be performed.
NEURAL NETWORKS ARCHITECTURE
[Figure: network diagram of neurons/nodes connected by synapses/connecting links, with the connection weights labelled $W^{1}_{11}, W^{1}_{12}, \ldots, W^{2}_{32}$]
$W^{i}_{jk}$ is the weight associated with the path connecting the jth element of the ith layer to the kth element of the (i+1)th layer.
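To make the $W^{i}_{jk}$ notation concrete, here is a minimal numpy sketch of a forward pass through a network with one hidden layer; the layer sizes, activation function, and random weights are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes: 3 input neurons (e.g., 3 bands), 4 hidden neurons, 2 output classes
n_input, n_hidden, n_output = 3, 4, 2

# W1[j, k] corresponds to W^1_jk: input element j -> hidden element k
W1 = rng.normal(scale=0.1, size=(n_input, n_hidden))
# W2[j, k] corresponds to W^2_jk: hidden element j -> output element k
W2 = rng.normal(scale=0.1, size=(n_hidden, n_output))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x):
    """Propagate one input vector (n_input,) through the two weight layers."""
    hidden = sigmoid(x @ W1)       # hidden-layer activations
    output = sigmoid(hidden @ W2)  # output-layer activations (one per class)
    return output

print(forward(np.array([0.2, 0.5, 0.8])))
```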
Parameters Affecting Neural Networks Classification
I. Architectural Parameters
1. Number of Neurons/Nodes
Input Neurons
 The number of input neurons is determined by the number of spectral bands (input features).
Hidden Neurons
 The number of nodes in the hidden layers defines the complexity and power of the
neural network model to delineate the underlying relationships and structures inherent
in a dataset.
 The number of hidden-layer nodes has a considerable effect on both classification
accuracy and training time requirements.
 The level of classification accuracy that a neural network can produce is
related to the generalization capabilities of that network:
 the number of nodes in the hidden layer(s) should be large enough for the
correct representation of the problem,
 but at the same time small enough to retain adequate generalization capabilities.
 Networks that are too small cannot identify the internal structure of the data (a state
known as underfitting) and therefore produce lower classification accuracies.
 Networks that are too large are likely to become over-specific (overfitted) to the training data.
 The number of neurons in the hidden layers depends on the number of nodes/neurons
in the input layer and is commonly estimated with heuristics expressed in terms of Ni,
where Ni represents the number of input nodes (a sketch of such heuristics follows).
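The specific heuristics in the source are not reproduced here; as an illustration only, the sketch below evaluates a few commonly cited rules of thumb from the neural-network literature (e.g., 2Ni + 1 and 3Ni), which may differ from the ones the author intended.

```python
def hidden_node_heuristics(n_input):
    """Illustrative rules of thumb for sizing a single hidden layer.

    These are commonly cited heuristics, not the specific formulas from the source."""
    return {
        "2*Ni + 1": 2 * n_input + 1,
        "3*Ni":     3 * n_input,
        "2*Ni/3":   max(1, (2 * n_input) // 3),
    }

# Example: 6-band imagery -> 6 input nodes
print(hidden_node_heuristics(6))   # -> {'2*Ni + 1': 13, '3*Ni': 18, '2*Ni/3': 4}
```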
Output Neurons
 The number of neurons in the output layer depends on the number of classes required from the
remote sensor datasets.
II. Training Parameters
1. Number of training samples
 The number of training samples employed at the learning stage has a significant impact on the
performance of any classifier.
 Although the size of the training data is of considerable importance, the characteristics and
distributions of the data, as well as the sampling strategy used, can also affect the results.
 Heuristics have been proposed for computing the optimum number of training samples, expressed
in terms of Ni (the number of input units) and NW (the total number of weights in the network).
2. Initial Weights
 The initial weight values define a starting location on the multi-dimensional error
surface.
 When large initial values are assigned to the weights, the learning process is
likely to slow down.
 If the initial weight values are assigned very small values, the backpropagation algorithm may
operate on a very flat area around the origin of the error surface.
3. Learning Rate and Momentum Factors
 The learning rate determines the size of the steps taken in the search for the global
minimum of the error function during training.
 The momentum term uses the previous weight change to determine the direction
of the search for the global minimum of the error.
4. Training Iterations
 The model iterates through its training algorithm until
 the required error goal is achieved, or
 the maximum number of iterations is reached (a minimal training-loop sketch follows).
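A minimal sketch of a backpropagation-style training loop follows, showing how the learning rate, momentum factor, error goal, and maximum number of iterations interact; the tiny single-layer network, synthetic data, and parameter values are illustrative assumptions, not the source's configuration.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))                        # 100 samples, 3 input features (e.g., bands)
y = (X[:, 0] + X[:, 1] > 0).astype(float)[:, None]   # synthetic binary target

W = rng.normal(scale=0.1, size=(3, 1))               # small initial weights
prev_update = np.zeros_like(W)

learning_rate, momentum = 0.1, 0.9
error_goal, max_iter = 0.05, 1000

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for it in range(max_iter):
    out = sigmoid(X @ W)
    err = out - y
    mse = float((err ** 2).mean())
    if mse <= error_goal:                             # stop when the error goal is achieved
        break
    grad = X.T @ (err * out * (1 - out)) / len(X)     # gradient of the error (up to a constant factor)
    update = -learning_rate * grad + momentum * prev_update
    W += update
    prev_update = update                              # momentum reuses the previous weight change

print(f"stopped after {it + 1} iterations, MSE = {mse:.3f}")
```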