There are many methods for determining classification accuracy. This paper demonstrates the
significance of the entropy of training signatures in classification. The entropy of the training
signatures of a raw digital image represents the heterogeneity of the brightness values of its
pixels across bands. An image comprising a single homogeneous lu/lc (land-use/land-cover)
category will therefore have nearly uniform reflectance values, resulting in a very low entropy
value, whereas an image containing diverse lu/lc categories will have widely differing
reflectance values, so its entropy will be relatively high. This concept leads to an analysis of
classification accuracy. Although entropy has been used many times in RS and GIS, its use in
determining classification accuracy is a new approach.
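As a concrete illustration of the idea above (a minimal sketch, not the paper's implementation; the function name and sample data are illustrative), Shannon entropy over a band's brightness values is near zero for a homogeneous patch and high for a mixed one:

```python
import math
from collections import Counter

def shannon_entropy(pixels):
    """Shannon entropy (bits) of a list of brightness values."""
    counts = Counter(pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A homogeneous patch (one lu/lc class) has zero entropy;
# a patch of 64 distinct values has the maximum log2(64) = 6 bits.
homogeneous = [120] * 64
mixed = list(range(64))
```

Comparing `shannon_entropy(homogeneous)` with `shannon_entropy(mixed)` reproduces the low-vs-high entropy contrast the abstract describes.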
DATA MINING ATTRIBUTE SELECTION APPROACH FOR DROUGHT MODELLING: A CASE STUDY ...IJDKP
ABSTRACT
The objectives of this paper were to 1) develop an empirical method for selecting relevant attributes for modelling drought and 2) select the most relevant attributes for drought modelling and prediction in the Greater Horn of Africa (GHA). Twenty-four attributes from different domain areas were used for this experimental analysis. Two attribute selection algorithms were used for the current study: Principal Component Analysis (PCA) and correlation-based attribute selection (CAS). Using the PCA and CAS algorithms, the 24 attributes were ranked by their merit value. Accordingly, 15 attributes were selected for modelling drought in GHA. The average merit values for the selected attributes ranged from 0.5 to 0.9. The methodology developed here helps to avoid the uncertainty of domain experts' attribute-selection challenges, which are unsystematic and dominated by somewhat arbitrary trial and error. Future research may evaluate the developed methodology using relevant classification techniques and quantify the actual information gain from the developed approach.
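A correlation-based attribute ranking of the kind this abstract describes can be sketched as follows (an illustrative simplification, not the paper's CAS implementation; attribute names and data are invented for the example):

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (vx * vy)

def rank_attributes(data, target):
    """Rank attributes by |correlation| with the target (the 'merit' here)."""
    merits = {name: abs(pearson(col, target)) for name, col in data.items()}
    return sorted(merits.items(), key=lambda kv: -kv[1])

# Hypothetical example: rainfall tracks the drought index, noise does not.
ranked = rank_attributes(
    {"rain": [2, 4, 6, 8], "noise": [3, 1, 4, 1]},
    [1, 2, 3, 4],
)
```

Attributes whose merit falls below a chosen cut-off (0.5 in the paper's reported range) would then be discarded.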
An Application of Genetic Algorithm for Non-restricted Space and Pre-determin...drboon
The use of a genetic algorithm is presented to solve a facility layout problem where space is unrestricted but the ratio of plant length to width is pre-determined. A two-level chromosome is constructed, and six rules are established to translate the chromosome into a facility design. An approach to solving the facility layout problem is proposed, and a numerical example illustrates the approach.
Effect of Feature Selection on Gene Expression Datasets Classification Accura...IJECEIAES
Feature selection attracts researchers in machine learning and data mining. It consists of selecting the variables that have the greatest impact on dataset classification and discarding the rest. This dimensionality reduction allows classifiers to be faster and more accurate. This paper examines the effect of feature selection on the accuracy of classifiers widely used in the literature. These classifiers are compared on three real datasets pre-processed with feature selection methods. More than 9% improvement in classification accuracy is observed, and k-means appears to be the classifier most sensitive to feature selection.
ATTRIBUTE REDUCTION-BASED ENSEMBLE RULE CLASSIFIERS METHOD FOR DATASET CLASSI...csandit
Attribute reduction and classification are essential when dealing with large datasets
that comprise numerous input attributes. Many search methods and classifiers have been
used to find the optimal number of attributes. The aim of this paper is to find the
optimal set of attributes and improve classification accuracy by adopting an ensemble
rule classifiers method. The research process involves two phases: finding the optimal
set of attributes, and applying the ensemble classifiers method to the classification
task. Results are reported in terms of percentage accuracy, number of selected
attributes, and rules generated. Six datasets were used for the experiment. The final
output is an optimal set of attributes combined with the ensemble rule classifiers
method. The experimental results, conducted on public real datasets, demonstrate that
the ensemble rule classifiers method consistently improves classification accuracy on
the selected datasets. Significant improvement in accuracy and an optimal set of
selected attributes are achieved by adopting the ensemble rule classifiers method.
COMPARISON OF HIERARCHICAL AGGLOMERATIVE ALGORITHMS FOR CLUSTERING MEDICAL DO...ijseajournal
The extensive amount of data stored in medical documents requires methods that help users
find what they are looking for effectively, by organizing large amounts of information into
a small number of meaningful clusters. The produced clusters contain groups of objects that
are more similar to each other than to the members of any other group. Thus, the aim of
high-quality document clustering algorithms is to determine a set of clusters in which
inter-cluster similarity is minimized and intra-cluster similarity is maximized. The most
important feature of many clustering algorithms is treating clustering as an optimization
process, that is, maximizing or minimizing a particular clustering criterion function
defined over the whole clustering solution. The only real difference between agglomerative
algorithms is how they choose which clusters to merge. The main purpose of this paper is to
compare different hierarchical agglomerative clustering algorithms, evaluating the quality
of the clusters they produce under different criterion functions, for the problem of
clustering medical documents. Our experimental results showed that the agglomerative
algorithm that uses I1 as its criterion function for choosing which clusters to merge
produced better cluster quality than the other criterion functions, in terms of entropy and
purity as external measures.
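The merge-choice mechanism that distinguishes agglomerative algorithms can be sketched as follows. This toy uses single-linkage distance on one-dimensional points as the merge criterion, purely for illustration; the paper's I1 and related criterion functions operate on document vectors:

```python
def single_linkage(points, k):
    """Repeatedly merge the two closest clusters (single-linkage
    criterion) until only k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single-linkage: distance between closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)
    return [sorted(c) for c in clusters]
```

Swapping the inner `min` for another criterion function changes which clusters merge, which is exactly the design axis the paper compares.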
Improving of classification accuracy of cyst and tumor using local polynomial...TELKOMNIKA JOURNAL
Cysts and tumors in the oral cavity are receiving serious attention from health experts, along with the increasing number of deaths from oral cancer in developing countries. Early detection of cysts and tumors using dental panoramic images is needed, since their initial growth does not cause any complaints. Image processing is used as a means of distinguishing and classifying cysts and tumors. Previous studies classified cysts and tumors using a mathematical computation approach, namely the support vector machine method, whose results were still unsatisfactory and had not been validated. Therefore, in this study we propose a method, namely a nonparametric regression model based on a local polynomial estimator, that can improve the classification accuracy of cysts and tumors in human dental panoramic images. Using the proposed method, we obtain a classification accuracy of 90.91%, which is greater than the 76.67% achieved by the support vector machine method. Also, in the validation process, the nonparametric regression model approach gives a significant Press's Q statistical test value. We therefore conclude that the nonparametric regression model approach improves classification accuracy and gives a better outcome for classifying cysts and tumors in dental panoramic images than the support vector machine method.
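The local polynomial estimator underlying the proposed model can be sketched in its simplest degree-1 (local linear) form. This is a generic textbook estimator with a Gaussian kernel, not the authors' model; the bandwidth and data below are illustrative:

```python
import math

def local_linear(xs, ys, x0, h):
    """Local polynomial (degree-1) regression estimate at x0,
    using a Gaussian kernel with bandwidth h."""
    w = [math.exp(-0.5 * ((x - x0) / h) ** 2) for x in xs]
    s0 = sum(w)
    s1 = sum(wi * (x - x0) for wi, x in zip(w, xs))
    s2 = sum(wi * (x - x0) ** 2 for wi, x in zip(w, xs))
    t0 = sum(wi * y for wi, y in zip(w, ys))
    t1 = sum(wi * (x - x0) * y for wi, x, y in zip(w, xs, ys))
    # Closed-form solution of the 2x2 weighted least-squares system;
    # the intercept is the fitted value at x0.
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 ** 2)
```

A classifier can then threshold such fitted values; by construction the estimator reproduces any exactly linear trend, whatever the bandwidth.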
Dimensionality Reduction Evolution and Validationiosrjce
IOSR Journal of Computer Engineering (IOSR-JCE) is a double-blind peer-reviewed international journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes publication of high-quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high-quality technical notes are invited for publication.
Multi-Population Methods with Adaptive Mutation for Multi-Modal Optimization ...ijscai
This paper presents an efficient scheme for locating multiple peaks in multi-modal optimization
problems using genetic algorithms (GAs). Because the premature convergence problem arises from
loss of diversity, a multi-population technique can be applied to maintain diversity in the
population and the convergence capacity of GAs. The proposed scheme combines multi-population
with an adaptive mutation operator, which determines two different mutation probabilities for
different sites of the solutions. The probabilities are updated according to the fitness and
distribution of solutions in the search space during the evolution process. The experimental
results demonstrate the performance of the proposed algorithm on a set of benchmark problems in
comparison with relevant algorithms.
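One way adaptive mutation of this flavor can work is sketched below (an illustrative simplification, not the paper's operator: here the two probabilities are assigned per individual by fitness rather than per site):

```python
import random

def adaptive_mutate(pop, fitness, p_low=0.01, p_high=0.2):
    """Flip bits of each individual: below-average individuals get the
    higher mutation rate to restore diversity, fitter ones the lower rate."""
    avg = sum(fitness) / len(fitness)
    out = []
    for ind, f in zip(pop, fitness):
        p = p_low if f >= avg else p_high
        out.append([b ^ (random.random() < p) for b in ind])
    return out
```

Updating `p_low`/`p_high` each generation from the population's fitness spread gives the adaptive behaviour the abstract describes.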
A NOVEL APPROACH FOR FEATURE EXTRACTION AND SELECTION ON MRI IMAGES FOR BRAIN...cscpconf
Feature extraction is a method of capturing the visual content of an image: the process of
representing a raw image in reduced form to facilitate decision making such as pattern
classification. The objective of this paper is to present a novel method of feature selection
and extraction. The approach combines intensity-, texture-, and shape-based features and
classifies the tumor region as white matter, gray matter, CSF, abnormal, or normal area. The
experiment is performed on 140 tumor-containing brain MR images from the Internet Brain
Segmentation Repository. PCA and Linear Discriminant Analysis (LDA) were applied to the
training sets; the Support Vector Machine (SVM) classifier served to compare nonlinear
techniques with linear ones. PCA and LDA are used to reduce the number of features. Feature
selection using the proposed technique is more beneficial, as it analyses the data according
to the grouping class variable and gives a reduced feature set with high classification accuracy.
Statistical modelling is of prime importance in every sphere of data analysis. This paper reviews the justification for fitting a linear model to collected data. Inappropriateness of the fitted model may be due to two reasons: 1) a wrong choice of analytical form, or 2) the adverse effects of outliers and/or influential observations. The aim is to identify outliers using the deletion technique. I extend the results of deletion diagnostics to the exchangeable model, review some results on checking the analytical form of the model, and illustrate the technique through an example.
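The deletion technique can be sketched for simple linear regression: refit the model with each observation removed in turn and measure how much the fitted slope moves (a minimal leave-one-out diagnostic, illustrative only; the paper's exchangeable-model extension is more general):

```python
def slope_intercept(xs, ys):
    """Ordinary least-squares fit y = b*x + a."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return b, my - b * mx

def deletion_influence(xs, ys):
    """Absolute change in the fitted slope when each observation
    is deleted in turn; large values flag influential points."""
    b_full, _ = slope_intercept(xs, ys)
    out = []
    for i in range(len(xs)):
        b_i, _ = slope_intercept(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        out.append(abs(b_full - b_i))
    return out
```

On data that is linear except for one outlier, the outlier's deletion influence dominates, which is exactly the signal the deletion diagnostic exploits.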
A Mathematical Programming Approach for Selection of Variables in Cluster Ana...IJRES Journal
Data clustering is a common technique for statistical data analysis; it is defined as a class of
statistical techniques for classifying a set of observations into distinct groups. Cluster
analysis seeks to minimize within-group variance and maximize between-group variance. In this
study we formulate a mathematical programming model that chooses the most important variables in
cluster analysis. A nonlinear binary model is suggested to select the most important variables
in clustering a set of data. The idea of the suggested model is to cluster the data by
minimizing the distance between observations within groups. Indicator variables are used to
select the most important variables in the cluster analysis.
Prediction model of algal blooms using logistic regression and confusion matrix IJECEIAES
Algal bloom data are collected and refined as experimental data for algal bloom prediction. The refined algal bloom dataset is analyzed by logistic regression, and statistical tests and regularization are performed to find the marine environmental factors affecting algal blooms. The predicted value of algal blooms is obtained through logistic regression analysis using these factors. The actual and predicted values of the algal bloom dataset are then applied to a confusion matrix. By improving the decision boundary of the existing logistic regression, accuracy, sensitivity, and precision for algal bloom prediction are improved. In this paper, an algal bloom prediction model is established by an ensemble method using logistic regression and a confusion matrix. Algal bloom prediction is improved, and this is verified through big data analysis.
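The confusion-matrix step, including the adjustable decision boundary, can be sketched as follows (a generic sketch; the labels and probabilities below are invented, and the paper's threshold-tuning procedure is not reproduced):

```python
def confusion_metrics(y_true, prob, threshold=0.5):
    """Accuracy, sensitivity (recall), and precision computed from a
    confusion matrix built at a chosen decision threshold."""
    y_pred = [1 if p >= threshold else 0 for p in prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / n,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }
```

Moving `threshold` shifts the logistic regression's decision boundary, trading precision against sensitivity, which is how the abstract's metric improvements are obtained.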
Comparison of Interpolation Methods in Prediction the Pattern of Basal Stem R...Waqas Tariq
Basal Stem Rot (BSR), caused by Ganoderma boninense, is the most serious disease of oil palm trees in Malaysia. The analysis of plant disease has been carried out extensively with the advancement of computer technology; in spatial and temporal terms, however, it is very complicated to process. Furthermore, the application of GIS to plant disease analysis is becoming more popular, precise, and advanced. In previous studies, Kriging was used to predict the pattern of BSR disease. In this study, two interpolation methods commonly used in GIS, Kriging and Inverse Distance Weighting (IDW), are used to interpolate and predict the pattern of BSR disease. Since IDW is an exact method and considered the more accurate one, more accurate results were expected; however, the accuracy of both methods turned out to be the same. Based on the characteristics, advantages, and disadvantages of both methods, Inverse Distance Weighting is recommended in this study, but for more informative data, Ordinary Kriging is suggested as the preferable alternative method.
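IDW, the simpler of the two interpolators compared above, can be sketched in a few lines (a generic sketch with invented sample points; GIS packages implement the same idea with spatial indexing):

```python
def idw(known, x0, y0, power=2):
    """Inverse Distance Weighting: estimate the value at (x0, y0)
    from known (x, y, value) samples. Exact at sample locations."""
    num = den = 0.0
    for x, y, v in known:
        d2 = (x - x0) ** 2 + (y - y0) ** 2
        if d2 == 0:
            return v  # exact interpolator: a sample predicts itself
        w = 1.0 / d2 ** (power / 2)
        num += w * v
        den += w
    return num / den
```

The early return at zero distance is what makes IDW an "exact" method in the sense the abstract uses: predictions at surveyed trees reproduce the observed disease values.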
A Survey and Comparative Study of Filter and Wrapper Feature Selection Techni...theijes
Feature selection is considered a problem of global combinatorial optimization in machine learning: it reduces the number of features and removes irrelevant, noisy, and redundant data. However, identifying useful features from hundreds or even thousands of related features is not an easy task. Selecting relevant genes from microarray data becomes even more challenging owing to the high dimensionality of the features, the multiclass categories involved, and the usually small sample size. In order to improve prediction accuracy and to avoid incomprehensibility due to the number of features, different feature selection techniques can be implemented. This survey classifies and analyzes different approaches, aiming not only to provide a comprehensive presentation but also to discuss challenges and various performance parameters. The techniques are generally classified into three categories: filter, wrapper, and hybrid.
The purpose of this paper is to present a survey of image registration techniques. Registration is a fundamental task in image processing used to match two or more pictures taken, for example, at different times, from different sensors, or from different viewpoints. It geometrically aligns two images: the reference and the sensed image. Specific examples of systems where image registration is a significant component include matching a target with a real-time image of a scene. Various applications of image registration include target recognition, monitoring global land usage using satellite images, matching stereo images to recover shape for navigation, and aligning images from different medical modalities for diagnosis.
Defining Homogenous Climate zones of Bangladesh using Cluster AnalysisPremier Publishers
Climate zones of Bangladesh are identified using the mathematical methodology of cluster analysis. Monthly rainfall data from 34 climate stations, covering 1991 to 2013, are used in the cluster analysis. Five agglomerative hierarchical clustering measures, based on six commonly used proximity measures, are chosen to perform the regionalization. In addition, three popular techniques, K-means, fuzzy, and density-based clustering, are applied initially to decide the most suitable method for identifying homogeneous regions. Cluster stability is also tested using nine validity indices. The Ward method based on Euclidean distance, K-means, and fuzzy clustering are judged the most likely to yield acceptable results in this particular case, as is often so in climatological research. In this analysis we found seven distinct climate zones in Bangladesh.
PARTICLE SWARM OPTIMIZATION FOR MULTIDIMENSIONAL CLUSTERING OF NATURAL LANGUA...IAEME Publication
We consider a non-linear dimensionality reduction method that takes into account the
discriminating power of the solution found for given values of the categorical variable
associated with each observation. A stochastic optimization method known as particle swarm
optimization is proposed to find features that ensure the best separation of observations in
terms of a given quality functional. The quality of a solution is evaluated by the purity of
the clusters obtained with the k-means method or with self-organizing Kohonen feature maps.
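The core particle swarm update used in such schemes can be sketched on a toy objective (a minimal generic PSO, not the paper's clustering-quality functional; all parameter values are conventional defaults chosen for the example):

```python
import random

def pso(f, dim, n=20, iters=200, lo=-5.0, hi=5.0,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimizer: each particle is pulled toward
    its own best position (pbest) and the swarm's best (gbest)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    vel = [[0.0] * dim for _ in range(n)]
    pbest = [p[:] for p in pos]
    pbest_f = [f(p) for p in pos]
    g = min(range(n), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    for _ in range(iters):
        for i in range(n):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            fi = f(pos[i])
            if fi < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], fi
                if fi < gbest_f:
                    gbest, gbest_f = pos[i][:], fi
    return gbest, gbest_f
```

In the paper's setting, `f` would score a candidate feature subset by the purity of the resulting k-means or Kohonen-map clusters; here a sphere function stands in for it.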
A Genetic Algorithm on Optimization Test FunctionsIJMERJOURNAL
ABSTRACT: Genetic Algorithms (GAs) have become increasingly useful over the years for solving combinatorial problems. Though they are generally accepted to be good performers among metaheuristic algorithms, most works have concentrated on the application of GAs rather than their theoretical justification. In this paper, we examine and justify the suitability of genetic algorithms for solving complex, multi-variable, multi-modal optimization problems. To achieve this, a simple genetic algorithm was used to solve four standard complicated optimization test functions, namely the Rosenbrock, Schwefel, Rastrigin, and Shubert functions. These functions are benchmarks for testing the quality of an optimization procedure in reaching a global optimum. We show that the method converges quickly to the global optima and that the optimal values for the Rosenbrock, Rastrigin, Schwefel, and Shubert functions are zero (0), zero (0), -418.9829, and -14.5080, respectively.
Data encryption using LSB matching algorithm and Reserving Room before Encryp...IJERA Editor
Nowadays, more and more attention is paid to reversible data hiding (RDH) in encrypted images,
since it maintains the excellent property that the original cover can be losslessly recovered
after the embedded data are extracted, while protecting the confidentiality of the image
content. Previously proposed methods embed data by reversibly vacating room from the encrypted
images, which may cause errors during data extraction and/or image restoration. In this paper,
a novel method is proposed that reserves room before encryption with a traditional RDH
algorithm, making it easy for the data hider to reversibly embed data in the encrypted image.
The proposed method achieves real reversibility; that is, data extraction and image recovery
are free of any error.
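The basic LSB embedding idea behind such schemes can be sketched as follows. Note this is plain LSB replacement for illustration only, not the LSB-matching or room-reserving RDH algorithm the paper proposes; pixel values and payload are invented:

```python
def embed_lsb(pixels, bits):
    """Hide a bit sequence in the least-significant bits of pixel values."""
    out = pixels[:]
    for i, b in enumerate(bits):
        out[i] = (out[i] & ~1) | b  # clear the LSB, then set it to b
    return out

def extract_lsb(pixels, n):
    """Read the first n hidden bits back out of the pixel values."""
    return [p & 1 for p in pixels[:n]]
```

Each pixel changes by at most 1, so the stego image is visually indistinguishable; the RDH schemes discussed above add the machinery needed to also restore the original pixels exactly.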
Study Effective of Wind Load on Behavior of ShearWall in Frame StructureIJERA Editor
Wind load is the result of wind pressures acting on the building surfaces during a wind event.
This wind pressure is primarily a function of wind speed, because the pressure or load
increases with the square of the wind velocity. Structural walls, or shear walls, are elements
used to resist lateral loads such as those generated by wind and earthquakes. Structural walls
are considerably deeper than typical beams or columns. This attribute gives them considerable
in-plane stiffness, which makes them a natural choice for resisting lateral loads. In addition
to considerable strength, structural walls can dissipate a great deal of energy if detailed
properly, making them an invaluable structural element when protecting buildings from seismic
events. Buildings often rely on structural walls as the main lateral-force-resisting system,
and shear walls are required to perform in multiple ways; they can be designed to limit
building damage to a specified degree. The load-deformation response of the structural walls
must be accurately predicted and related to structural damage in order to achieve these
performance goals under loading events of various magnitudes. The applied load is generally
transferred to the wall by a diaphragm, collector, or drag member. The performance of framed
buildings depends on the structural system adopted for the structure. The term structural
system, or structural frame, refers in structural engineering to the load-resisting sub-system
of a structure, which transfers loads through interconnected structural components or members.
These structural systems need to be chosen based on the building's height and the loads to be
carried. The selected structural system must satisfy both strength and stiffness requirements,
and must be adequate to resist the lateral and gravity loads that cause horizontal shear
deformation and overturning deformation. The efficiency of a structural system is measured in
terms of its ability to resist lateral load, which increases with the height of the frame. A
building can be considered tall when the effect of lateral loads is reflected in its design.
Lateral deflections of framed buildings should be limited to prevent damage to both structural
and nonstructural elements. In the present study, the structural performance of a framed
building with a shear wall is analyzed.
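The square-law relation between wind speed and pressure stated above is the standard dynamic pressure formula, which can be checked numerically (the air density value is the common sea-level assumption, not a figure from this study):

```python
def wind_pressure(v, rho=1.225):
    """Dynamic wind pressure q = 0.5 * rho * v**2 in pascals,
    for wind speed v in m/s and air density rho in kg/m^3
    (1.225 kg/m^3 is the standard sea-level assumption)."""
    return 0.5 * rho * v ** 2
```

Doubling the wind speed quadruples the pressure, which is why lateral load governs the design of taller frames.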
Multi-Population Methods with Adaptive Mutation for Multi-Modal Optimization ...ijscai
This paper presents an efficient scheme to locate multiple peaks on multi-modal optimization problems by
using genetic algorithms (GAs). The premature convergence problem shows due to the loss of diversity,
the multi-population technique can be applied to maintain the diversity in the population and the
convergence capacity of GAs. The proposed scheme is the combination of multi-population with adaptive
mutation operator, which determines two different mutation probabilities for different sites of the
solutions. The probabilities are updated by the fitness and distribution of solutions in the search space
during the evolution process. The experimental results demonstrate the performance of the proposed
algorithm based on a set of benchmark problems in comparison with relevant algorithms.
A NOVEL APPROACH FOR FEATURE EXTRACTION AND SELECTION ON MRI IMAGES FOR BRAIN...cscpconf
Feature extraction is a method of capturing visual content of an image. The feature extraction is
the process to represent raw image in its reduced form to facilitate decision making such as
pattern classification. The objective of this paper is to present a novel method of feature
selection and extraction. This approach combines the Intensity, Texture, shape based features
and classifies the tumor as white matter, Gray matter, CSF, abnormal and normal area. The
experiment is performed on 140 tumor contained brain MR images from the Internet Brain
Segmentation Repository. PCA and Linear Discriminant Analysis (LDA) were applied on the
training sets. The Support Vector Machine (SVM) classifier served as a comparison of
nonlinear techniques Vs linear ones. PCA and LDA methods are used to reduce the number of
features used. The feature selection using the proposed technique is more beneficial as it
analyses the data according to grouping class variable and gives reduced feature set with high classification accuracy.
Statistical modelling is of prime importance in each and every sphere of data analysis. This paper reviews the justification of fitting linear model to the collected data. Inappropriateness of the fitted model may be due two reasons 1.wrong choice of the analytical form, 2. Suffers from the adverse effects of outliers and/or influential observations. The aim is to identify outliers using the deletion technique. In I extend the result of deletion diagnostics to the ex- changeable model and reviews some results of model analytical form checking and the technique illustrated through an example.
A Mathematical Programming Approach for Selection of Variables in Cluster Ana...IJRES Journal
Data clustering is a common technique for statistical data analysis; it is defined as a class of
statistical techniques for classifying a set of observations into distinct groups. Cluster analysis
seeks to minimize within-group variance and maximize between-group variance. In this study we formulate a
mathematical programming model that chooses the most important variables in cluster analysis. A nonlinear
binary model is suggested to select the most important variables in clustering a set of data. The idea of the
suggested model depends on clustering data by minimizing the distance between observations within groups.
Indicator variables are used to select the most important variables in the cluster analysis.
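The abstract's core idea, selecting the variable subset that minimizes the distance between observations within groups, can be illustrated with a small brute-force sketch in place of the paper's nonlinear binary program; the data, labels, and function names here are hypothetical, not the authors' formulation:

```python
import itertools
import math

def within_group_distance(data, labels, vars_sel):
    """Sum of pairwise Euclidean distances (over the selected variables)
    between observations that share a cluster label."""
    total = 0.0
    for i in range(len(data)):
        for j in range(i + 1, len(data)):
            if labels[i] == labels[j]:
                total += math.sqrt(sum((data[i][v] - data[j][v]) ** 2
                                       for v in vars_sel))
    return total

def select_variables(data, labels, k):
    """Enumerate all k-subsets of variables (the binary indicator vector)
    and keep the subset minimizing within-group distance."""
    n_vars = len(data[0])
    return min(itertools.combinations(range(n_vars), k),
               key=lambda s: within_group_distance(data, labels, s))

# Toy data: variable 0 separates the two groups, variable 1 is noise.
data = [(0.0, 5.1), (0.2, 0.3), (0.1, 9.8), (4.0, 5.0), (4.2, 0.1), (4.1, 9.9)]
labels = [0, 0, 0, 1, 1, 1]
print(select_variables(data, labels, 1))  # variable 0 keeps clusters tight
```

Enumeration is exponential in the number of variables; the mathematical-programming formulation in the paper exists precisely to avoid this brute force on realistic data.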
Prediction model of algal blooms using logistic regression and confusion matrix IJECEIAES
Algal blooms data are collected and refined as experimental data for algal bloom prediction. The refined algal blooms dataset is analyzed by logistic regression analysis, and statistical tests and regularization are performed to find the marine environmental factors affecting algal blooms. The predicted value of algal bloom is obtained through logistic regression analysis using these marine environment factors. The actual and predicted values of the algal blooms dataset are applied to the confusion matrix. By improving the decision boundary of the existing logistic regression, the accuracy, sensitivity and precision of algal bloom prediction are improved. In this paper, the algal blooms prediction model is established by an ensemble method using logistic regression and the confusion matrix. Algal bloom prediction is improved, and this is verified through big data analysis.
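As a hedged illustration of the confusion-matrix step (not the paper's ensemble model), the following sketch computes accuracy, sensitivity, and precision from hypothetical bloom probabilities and shows how moving the decision boundary changes them:

```python
def confusion_matrix(y_true, y_prob, threshold=0.5):
    """2x2 confusion counts for a binary classifier at a given
    decision boundary (threshold on the predicted probability)."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1: tp += 1
        elif pred == 1 and y == 0: fp += 1
        elif pred == 0 and y == 0: tn += 1
        else: fn += 1
    return tp, fp, tn, fn

def metrics(tp, fp, tn, fn):
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0   # recall
    precision = tp / (tp + fp) if tp + fp else 0.0
    return accuracy, sensitivity, precision

# Hypothetical predicted bloom probabilities and observed outcomes.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

for t in (0.5, 0.35):   # moving the decision boundary
    print(t, metrics(*confusion_matrix(y_true, y_prob, t)))
```

On this toy data, lowering the threshold from 0.5 to 0.35 raises accuracy and sensitivity, the kind of decision-boundary improvement the abstract describes.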
Comparison of Interpolation Methods in Prediction the Pattern of Basal Stem R...Waqas Tariq
Basal Stem Rot (BSR), caused by Ganoderma boninense, is the most serious disease of oil palm trees in Malaysia. The analysis of plant disease has been carried out extensively with the advancement of computer technology. In particular, in spatial and temporal terms, it is very complicated to process. Furthermore, the application of GIS in plant disease analysis is becoming more popular, precise and advanced. In previous studies, Kriging has been used to predict the pattern of BSR disease. In this study, two commonly used interpolation methods in GIS, Kriging and Inverse Distance Weighting (IDW), are used to interpolate and predict the pattern of Basal Stem Rot disease. Since the IDW method is an exact and generally more accurate method, more accurate results were expected; however, the accuracy results of both methods are the same. Based on the characteristics, advantages and disadvantages of both methods, Inverse Distance Weighting is recommended in this study, but for more informative data Ordinary Kriging is suggested as the preferable alternative method.
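IDW itself is simple enough to sketch; the following minimal implementation (with hypothetical sample points, not the study's survey data) also demonstrates the "exact interpolator" property the abstract mentions:

```python
def idw(points, query, power=2):
    """Inverse Distance Weighting: estimate the value at `query` as a
    weighted average of sampled points, with weights 1/d^power.
    IDW is exact: at a sampled location it returns the sample itself."""
    num = den = 0.0
    for (x, y, value) in points:
        d2 = (x - query[0]) ** 2 + (y - query[1]) ** 2
        if d2 == 0.0:
            return value            # exact-interpolator property
        w = 1.0 / d2 ** (power / 2)
        num += w * value
        den += w
    return num / den

# Hypothetical disease-severity samples at surveyed palms (x, y, severity).
samples = [(0, 0, 1.0), (10, 0, 3.0), (0, 10, 2.0), (10, 10, 4.0)]
print(idw(samples, (5, 5)))   # centre: symmetric weights -> mean = 2.5
print(idw(samples, (0, 0)))   # exact at a sample point -> 1.0
```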
A Survey and Comparative Study of Filter and Wrapper Feature Selection Techni...theijes
Feature selection is considered a problem of global combinatorial optimization in machine learning, which reduces the number of features and removes irrelevant, noisy and redundant data. However, identification of useful features from hundreds or even thousands of related features is not an easy task. Selecting relevant genes from microarray data becomes even more challenging owing to the high dimensionality of features, the multiclass categories involved and the usually small sample size. In order to improve prediction accuracy and to avoid incomprehensibility due to the number of features, different feature selection techniques can be implemented. This survey classifies and analyzes different approaches, aiming not only to provide a comprehensive presentation but also to discuss challenges and various performance parameters. The techniques are generally classified into three categories: filter, wrapper and hybrid.
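A minimal example of the filter approach, which scores each feature independently of any classifier, might look like the following; the gene names and values are invented for illustration:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def filter_rank(features, target):
    """Filter method: score each feature on its own (here |Pearson r|
    with the class label) and rank, with no classifier in the loop."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy microarray-like data: gene_a tracks the class, gene_b is noise.
target = [0, 0, 0, 1, 1, 1]
features = {"gene_a": [0.1, 0.2, 0.0, 0.9, 1.0, 0.8],
            "gene_b": [0.5, 0.1, 0.9, 0.4, 0.2, 0.7]}
print(filter_rank(features, target))  # gene_a ranked first
```

A wrapper method would instead evaluate candidate subsets by training the target classifier on each, which is more accurate but far more expensive, the trade-off the survey discusses.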
The purpose of this paper is to present a survey of image registration techniques. Registration is a fundamental task in image processing used to match two or more pictures taken, for example, at different times, from different sensors, or from different viewpoints. It geometrically aligns two images: the reference and sensed images. Specific examples of systems where image registration is a significant component include matching a target with a real-time image of a scene. Various applications of image registration are target recognition, monitoring global land usage using satellite images, matching stereo images to recover shape for navigation, and aligning images from different medical modalities for diagnosis.
Defining Homogenous Climate zones of Bangladesh using Cluster AnalysisPremier Publishers
Climate zones of Bangladesh are identified using the mathematical methodology of cluster analysis. Monthly rainfall data from 34 climate stations for 1991 to 2013 are used in the cluster analysis. Five agglomerative hierarchical clustering measures based on six commonly used proximity measures are chosen to perform the regionalization. In addition, three popular techniques, K-means, fuzzy and density-based clustering, are applied initially to decide the most suitable method for the identification of homogeneous regions. Stability of the clusters is also tested based on nine validity indices. It is concluded that the Ward method based on Euclidean distance, K-means and fuzzy clustering are the most likely to yield acceptable results in this particular case, as is often the case in climatological research. In this analysis we found seven different climate zones in Bangladesh.
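The Ward/Euclidean combination the study favours can be sketched as a small agglomerative routine; the rainfall figures below are hypothetical stand-ins, not the 34-station dataset:

```python
def ward_cluster(points, k):
    """Agglomerative (bottom-up) clustering with Ward's criterion:
    repeatedly merge the pair of clusters whose union gives the
    smallest increase in within-cluster variance."""
    clusters = [[p] for p in points]

    def ward_dist(a, b):
        # Ward's merge cost for 1-D clusters: |A||B|/(|A|+|B|) * (mean gap)^2
        ca, cb = sum(a) / len(a), sum(b) / len(b)
        return (len(a) * len(b)) / (len(a) + len(b)) * (ca - cb) ** 2

    while len(clusters) > k:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: ward_dist(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters.pop(j)
    return [sorted(c) for c in clusters]

# Hypothetical mean monthly rainfall (mm) for six stations.
rainfall = [110, 115, 120, 300, 310, 520]
print(ward_cluster(rainfall, 3))  # three homogeneous rainfall groups
```

Real regionalization would use the full multivariate monthly series and validity indices to choose k; here k is fixed for illustration.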
PARTICLE SWARM OPTIMIZATION FOR MULTIDIMENSIONAL CLUSTERING OF NATURAL LANGUA...IAEME Publication
We consider a non-linear dimensionality reduction method which takes into account
the discriminating power of the solution found for given values of the categorical
variable associated with each observation. A stochastic optimization method known as
particle swarm optimization is proposed to find characteristics that ensure the
best separation of observations in terms of a given quality functional. The basis for
evaluating the quality of the solution is the purity of the clusters obtained with the
k-means method or with self-organizing Kohonen feature maps.
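A minimal PSO loop of the kind proposed, here minimizing a simple stand-in quality functional rather than cluster purity, could look like this (all parameter values are illustrative assumptions):

```python
import random

def pso(objective, dim, n_particles=20, iters=100, seed=1):
    """Minimal particle swarm optimization: each particle is pulled
    toward its personal best and the swarm's global best position."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5          # inertia, cognitive, social weights
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def sphere(x):                          # stand-in quality functional
    return sum(v * v for v in x)

best, val = pso(sphere, dim=3)
print(val)   # should be very close to 0
```

In the paper's setting the objective would instead be the (negated) purity of k-means or Kohonen-map clusters computed on the candidate projection.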
A Genetic Algorithm on Optimization Test FunctionsIJMERJOURNAL
ABSTRACT: Genetic Algorithms (GAs) have become increasingly useful over the years for solving combinatorial problems. Though they are generally accepted to be good performers among metaheuristic algorithms, most works have concentrated on the application of GAs rather than their theoretical justification. In this paper, we examine and justify the suitability of Genetic Algorithms for solving complex, multi-variable and multi-modal optimization problems. To achieve this, a simple Genetic Algorithm was used to solve four standard complicated optimization test functions, namely the Rosenbrock, Schwefel, Rastrigin and Shubert functions. These functions are benchmarks to test the quality of an optimization procedure towards a global optimum. We show that the method converges quickly to the global optima and that the optimal values for the Rosenbrock, Rastrigin, Schwefel and Shubert functions are zero (0), zero (0), -418.9829 and -14.5080 respectively.
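The paper's GA is not reproduced here, but a simple real-coded GA on the Rastrigin function (one of the four benchmarks) can be sketched as follows; the operator choices and parameter values are assumptions, not the authors' settings:

```python
import math
import random

def rastrigin(x):
    """Rastrigin test function: global minimum 0 at the origin."""
    return 10 * len(x) + sum(v * v - 10 * math.cos(2 * math.pi * v) for v in x)

def ga(fitness, dim, pop_size=50, gens=200, seed=7):
    """Simple real-coded GA: tournament selection, blend crossover,
    Gaussian mutation, and elitism."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.12, 5.12) for _ in range(dim)]
           for _ in range(pop_size)]

    def tournament():
        a, b = rng.sample(pop, 2)
        return a if fitness(a) < fitness(b) else b

    for _ in range(gens):
        elite = min(pop, key=fitness)
        new_pop = [elite[:]]                       # elitism
        while len(new_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            alpha = rng.random()                   # blend crossover
            child = [alpha * u + (1 - alpha) * v for u, v in zip(p1, p2)]
            if rng.random() < 0.2:                 # mutation probability
                k = rng.randrange(dim)
                child[k] += rng.gauss(0, 0.3)
            new_pop.append(child)
        pop = new_pop
    return min(pop, key=fitness)

best = ga(rastrigin, dim=2)
print(rastrigin(best))   # small value near the global optimum of 0
```

Rastrigin's many regularly spaced local minima are exactly why mutation and elitism matter here: crossover alone would average the population into one basin.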
Data encryption using LSB matching algorithm and Reserving Room before Encryp...IJERA Editor
Nowadays, more and more attention is paid to reversible data hiding (RDH) in encrypted images, since it
maintains the excellent property that the original cover can be losslessly recovered after embedded data is
extracted while protecting the image content’s confidentiality. Previously proposed methods embed data by
reversibly vacating room from the encrypted images, which may cause some errors in data during data extraction
and/or image restoration. In this paper, a novel method of reserving room before encryption with a traditional
RDH algorithm is proposed, making it easy for the data hider to reversibly embed data in the encrypted
image. The proposed method can achieve real reversibility, that is, data extraction and image recovery are
free of any error.
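The paper's reserving-room RDH scheme is not shown here; as background only, the LSB matching embedding primitive the title alludes to can be sketched as follows (the pixel values and message are invented):

```python
import random

def embed_lsb_matching(pixels, bits, seed=0):
    """LSB matching: when a pixel's LSB differs from the message bit,
    randomly add or subtract 1 (rather than forcing the LSB as plain
    LSB substitution does); the LSB then carries the bit either way."""
    rng = random.Random(seed)
    out = pixels[:]
    for i, bit in enumerate(bits):
        if (out[i] & 1) != int(bit):
            step = rng.choice((-1, 1))
            if out[i] == 0: step = 1          # stay within 8-bit range
            if out[i] == 255: step = -1
            out[i] += step
    return out

def extract_lsb(pixels, n_bits):
    """Read the message back from the least-significant bits."""
    return "".join(str(p & 1) for p in pixels[:n_bits])

cover = [200, 201, 57, 58, 129, 130, 7, 8]   # hypothetical 8-bit pixels
message = "10110010"
stego = embed_lsb_matching(cover, message)
assert extract_lsb(stego, len(message)) == message
print(stego)
```

Note this toy embedding is not reversible: the original cover cannot be recovered exactly, which is precisely the property the RDH methods discussed above are designed to add.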
Study Effective of Wind Load on Behavior of ShearWall in Frame StructureIJERA Editor
Wind load is really the result of wind pressures acting on the building surfaces during a wind event. This wind
pressure is primarily a function of the wind speed because the pressure or load increases with the square of the
wind velocity. Structural walls, or shear walls, are elements used to resist lateral loads, such as those generated
by wind and earthquakes. Structural walls are considerably deeper than typical beams or columns. This attribute
gives structural walls considerable in-plane stiffness which makes structural walls a natural choice for resisting
lateral loads. In addition to considerable strength, structural walls can dissipate a great deal of energy if detailed
properly. Walls are an invaluable structural element when protecting buildings from seismic events. Buildings
often rely on structural walls as the main lateral force resisting system. Shear walls are required to perform in
multiple ways. Shear walls can then be designed to limit building damage to the specified degree. The load-deformation
response of the structural walls must be accurately predicted and related to structural damage in
order to achieve these performance goals under loading events of various magnitudes. The applied load is
generally transferred to the wall by a diaphragm or collector or drag member. The performance of the framed
buildings depends on the structural system adopted for the structure. The term structural system or structural
frame in structural engineering refers to the load-resisting sub-system of a structure. The structural system
transfers loads through interconnected structural components or members. These structural systems need to be
chosen based on the building's height and the loads to be carried. The selection of appropriate structural
systems for building must satisfy both strength and stiffness requirements. The structural system must be
adequate to resist lateral and gravity loads that cause horizontal shear deformation and overturning deformation.
The efficiency of a structural system is measured in terms of their ability to resist lateral load, which increases
with the height of the frame. A building can be considered as tall when the effect of lateral loads is reflected in
the design. Lateral deflections of framed buildings should be limited to prevent damage to both structural and
nonstructural elements. In the present study, the structural performance of the framed building with a shear wall
will be analyzed.
Stress Analysis and Optimization of a Piaggio Ape Clutch Plate with different...IJERA Editor
The clutch plate is one of the important parts in power transmission systems. A good clutch design provides
better engine performance. A clutch is a device used to engage or disengage gears; it transfers the rotary
motion of one shaft to the other shaft when desired. In automobiles, friction clutches are widely used in
power transmission applications. To transmit maximum torque in friction clutches, selection of the friction
material is one of the important tasks.
In this thesis a model of the Piaggio Ape clutch plate has been generated in Pro-E Creo-5 and then imported into
ANSYS for power transmission applications. We have conducted structural analysis by varying the friction
surface material while keeping the aluminium base material the same. From the results, a comparison is made
between both materials to identify the better lining material for the Piaggio Ape clutch plate, using ANSYS
to find out which material is best for the lining of the friction surfaces.
Application of probiotics in complex treatment of tuberculosisIJERA Editor
Probiotic bacteria possessing the ability to suppress growth of Mycobacterium B5 are revealed. Antagonistic
activity of the selected strains was studied during growth on various nutrient media. The strains are adapted
to low-pH exposure and are resistant to a number of the antibiotics used in tuberculosis treatment. This
testifies to the prospects of further studies on the use of probiotics in the complex treatment of tuberculosis.
Static & Thermal Analysis of Positive Multiple Friction Plate using FEA PackageIJERA Editor
Clutch is a mechanical device, which is used to engage or disengage the source of power from the rest of the
power transmission system at the operator’s will. The clutch can connect or disconnect the driving shaft from
the driven shaft when necessary. Clutches are designed to transfer maximum torque with minimum heat
generation. During engagement and disengagement the two clutch discs have sliding motion between them.
The project covers the design and analysis of two positive multi-friction plates.
For the design of the friction plates 3D modeling software is used, and for the analysis the ANSYS package is used.
In the analysis part the two models are analyzed with different materials by conducting two types of analysis,
structural and thermal.
Structural analysis is done to find out the stress values, and thermal analysis to find out the temperature
distribution on the model. Based on these two sets of results, we suggest the best material for the effective
model of the multiple friction plate.
Determination of Immobilization Process Parameters of Corynebacterium glutami...IJERA Editor
The parameters of the immobilization process of Corynebacterium glutamicum on kappa carrageenan were identified by
a Plackett-Burman matrix, and the experiments were designed by response surface methodology with central
composite designs (RSM-CCD). The maximum yield of cell immobilization on the kappa carrageenan carrier reached
78% ± 2%. Optimal parameters were 3 grams of kappa carrageenan per 100 milliliters of sterile water and 58.58 million
cfu/mL, forming gels at 10°C for 25 minutes, with a particle-soaking speed of 150 rpm for 120 minutes in 0.58
M potassium chloride solvent. The immobilized finished products are applied in L-lysine production; they can be
reused 3 times, and the total yield of L-lysine accumulated was 93 g/L in the medium during 96 hours of
fermentation. The L-lysine productivity of the batch fermentation was 0.969 g·L⁻¹·h⁻¹. The established storage
conditions are a mixed solvent of CaCl2 0.5% (w/v) and KCl 0.5% (w/v) at pH 7.0 and 4°C. After 60 days of
storage, the surviving cell rate remained 51%.
Land Use and Land Cover Change Detection in Tiruchirappalli District Using Re...IJERA Editor
Land use and land cover play a dominant role in urbanization. Rapid urbanization has led to various activities
in a region; these changes generally take place on agricultural land and cause a decrease of arable land. The
satellite imageries LANDSAT 5 TM (1990), LANDSAT 7 ETM (2000) and LISS III (2010) are used, at a scale of
1:50,000. Over 1990, 2000 and 2010, covering a period of 19 years, the areal distribution of the land use and
land cover changes has been observed. The changes identified show a decrease of agricultural land, natural
vegetation, scrub land and water body, and an increase of built-up land, fallow land, river sand and land
without scrub. The land use and land cover maps are prepared using GIS software to evaluate the changes, and
strong variation is shown.
A Review: Six Sigma Implementation Practice in Manufacturing IndustriesIJERA Editor
Achieving higher productivity is a crucial factor in the field of production. Along with high productivity,
various other factors must be taken into consideration in manufacturing industries, such as global competitors,
diversity in product range, lead time, and customer demand in terms of quality and quantity. A benchmark
called Six Sigma has been developed for dealing with all these needs. Six Sigma is a quality initiative which
reduces variation in a process and helps to lower the cost of the product as well as the process. The objective of this
paper is to review and examine the advancement and encounters of Six Sigma practices in global manufacturing
industries and identify the key tools for each step in successful Six Sigma project execution. The paper also
integrates the lessons learned from successful Six Sigma projects and their prospective applications in various
manufacturing industries. In today's scenario, many global manufacturing industries operate their processes at the
two to four sigma quality levels.
Modification of a Two Wheeler Suspension System using FeaIJERA Editor
A spring is defined as an elastic body whose function is to compress when loaded and to recover its original
shape when the load is removed. A spring is a flexible element used to exert a force or a torque and, at the same
time, to store energy. The force can be a linear push or pull, or it can be radial. In two-wheelers, helical
suspension is commonly seen at the front and rear tyres on both sides, but new bike models are replacing the rear
double suspension with a single heavy-duty suspension. Our project deals with the design and modification of
the suspension system and analyzes whether one heavy-duty spring can replace the double springs.
For this we have conducted structural analysis by varying the spring material while keeping the base material the
same. From the results, a comparison is made across four materials to identify the better material for the
suspension system, using ANSYS to find out which material is best.
We also modified the actual model, conducted the same analysis on it, and validated which model is better.
The modeling is done in Creo-5 and the analysis in the ANSYS package.
Improved Grid Synchronization Algorithm for DG System using DSRF PLL under Gr...IJERA Editor
Distributed Generation (DG) System is a small scale electric power generation at or near the user’s facility as
opposed to the normal mode of centralized power generation. In order to ensure safe and reliable operation of a
power system based on DG, the grid synchronization algorithm plays a very important role. This paper presents a
Double Synchronous Reference Frame (DSRF) phase locked loop (PLL) based on synthesis circuit for grid
synchronization of distributed generation (DG) system under grid disturbances aimed to provide an estimation
of the angular frequency and both the positive and negative sequences of the fundamental component of an
unbalanced three-phase signal. The design of this PLL is based on a complete description of the source voltage
involving both positive and negative sequences in stationary coordinates and considering the angular frequency
as an uncertain parameter.
A Comparison of Accuracy Measures for Remote Sensing Image Classification: Ca...CSCJournals
This work investigated the consistency of both the category-level and the map-level accuracy measures for different scenarios and features using a Support Vector Machine. It was verified that the classification scenario and the features adopted did not influence the consistency of the accuracy measures, and that all accuracy measures are highly positively correlated.
DATA MINING ATTRIBUTE SELECTION APPROACH FOR DROUGHT MODELLING: A CASE STUDY ...IJDKP
The objectives of this paper were to 1) develop an empirical method for selecting relevant attributes for
modelling drought and 2) select the most relevant attribute for drought modelling and predictions in the
Greater Horn of Africa (GHA). Twenty four attributes from different domain areas were used for this
experimental analysis. Two attribute selection algorithms were used for the current study: Principal
Component Analysis (PCA) and correlation-based attribute selection (CAS). Using the PCA and CAS
algorithms, the 24 attributes were ranked by their merit value. Accordingly, 15 attributes were selected for
modelling drought in GHA. The average merit values for the selected attributes ranged from 0.5 to 0.9. The
methodology developed here helps to avoid the uncertainty of domain experts’ attribute selection
challenges, which are unsystematic and dominated by somewhat arbitrary trial. Future research may
evaluate the developed methodology using relevant classification techniques and quantify the actual
information gain from the developed approach.
The data obtained from remote sensing satellites furnish information about the land at varying resolutions
and have been widely used for change detection studies. There exist a huge number of change detection
methodologies and techniques, with the continual emergence of new ones. This paper provides a review of
pixel-based and object-based change detection techniques in conjunction with a comparison of their
merits and limitations. The advent of very-high-resolution remotely sensed images, exponentially increased
image data volume and multiple sensors demand the potential use of data mining techniques in tandem
with object-based methods for change detection.
Clustering heterogeneous categorical data using enhanced mini batch K-means ...IJECEIAES
Clustering methods in data mining aim to group a set of patterns based on their similarity. In a data survey, heterogeneous information is established with various types of data scales like nominal, ordinal, binary, and Likert scales. A lack of treatment of heterogeneous data and information leads to loss of information and poor decision-making. Although many similarity measures have been established, solutions for heterogeneous data in clustering are still lacking. The recent entropy distance measure seems to provide good results for heterogeneous categorical data; however, it requires many experiments and evaluations. This article presents a proposed framework for heterogeneous categorical data using mini batch k-means with an entropy measure (MBKEM), to investigate the effectiveness of the similarity measure in clustering heterogeneous categorical data. Secondary data from a public survey were used. The findings demonstrate that the proposed framework improved the clustering quality. MBKEM outperformed other clustering algorithms with accuracy at 0.88, v-measure (VM) at 0.82, adjusted Rand index (ARI) at 0.87, and Fowlkes-Mallows index (FMI) at 0.94. The average minimum elapsed time for cluster generation, varying k, was 0.26 s. In the future, the proposed solution would be beneficial for improving the quality of clustering for heterogeneous categorical data problems in many domains.
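Entropy-based distance aside, the mini batch k-means backbone can be sketched as follows; this toy version uses plain Euclidean distance on numeric stand-ins rather than the article's MBKEM entropy measure, and all names and values are illustrative:

```python
import random

def mini_batch_kmeans(data, k=2, batch_size=4, iters=50, seed=3):
    """Minimal mini-batch k-means (Sculley-style): each iteration draws
    a small random batch, assigns its points to the nearest centre, and
    nudges each centre with a decaying per-centre learning rate 1/count.
    Centres are seeded with a farthest-point heuristic for stability."""
    centres = [list(data[0])]
    while len(centres) < k:                      # farthest-point init
        far = max(data, key=lambda p: min(sum((a - b) ** 2
                                              for a, b in zip(p, c))
                                          for c in centres))
        centres.append(list(far))
    counts = [0] * k
    rng = random.Random(seed)

    def nearest(p):
        return min(range(k), key=lambda c: sum((a - b) ** 2
                                               for a, b in zip(p, centres[c])))

    for _ in range(iters):
        batch = [rng.choice(data) for _ in range(batch_size)]
        for p in batch:
            c = nearest(p)
            counts[c] += 1
            eta = 1.0 / counts[c]                # decaying learning rate
            centres[c] = [(1 - eta) * cc + eta * pp
                          for cc, pp in zip(centres[c], p)]
    return centres

# Two well-separated toy groups (numeric stand-ins for encoded categories).
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 5.0), (5.1, 5.2)]
centres = sorted(mini_batch_kmeans(data, 2))
print(centres)   # one centre near (0.1, 0.1), one near (5.1, 5.1)
```

Swapping the squared-difference distance in `nearest` for an entropy-based measure over categorical codes is where an MBKEM-style variant would differ from this sketch.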
Face Recognition using Improved FFT Based Radon by PSO and PCA TechniquesCSCJournals
Abstract Face recognition is one of the problems which can be handled very well using a hybrid technique or mixed transform rather than a single technique, performing well even for large problem sizes. In this paper we present the use of the Fourier-based Radon transform, improved by Particle Swarm Optimization (PSO). PSO in this research is used to select the optimum directions (projection angles) that achieve a very high recognition rate and fast computation. The number of directions selected using PSO is less than the number required by ordinary Radon, which leads to a small number of features. This number of features is reduced further using PCA to produce a low number of features used to represent faces in the database. Our method has been applied to the ORL database and achieves a 100% recognition rate.
On Confidence Intervals Construction for Measurement System Capability Indica...IRJESJOURNAL
Abstract: There are many criteria that have been proposed to determine the capability of a measurement system, all based on estimates of variance components. Some of them are the Precision to Tolerance Ratio, the Signal to Noise Ratio and the probabilities of misclassification. For most of these indicators, there are no exact confidence intervals, since the exact distributions of the point estimators are not known. In such situations, two approaches are widely used to obtain approximate confidence intervals: the Modified Large Samples (MLS) methods initially proposed by Graybill and Wang, and the construction of Generalized Confidence Intervals (GCI) introduced by Weerahandi. In this work we focus on the construction of the confidence intervals by the generalized approach in the context of Gauge repeatability and reproducibility studies. Since GCI are obtained by simulation procedures, we analyze the effect of the number of simulations on the variability of the confidence limits as well as the effect of the size of the experiment designed to collect data on the precision of the estimates. Both studies allowed deriving some practical implementation guidelines in the use of the GCI approach. We finally present a real case study in which this technique was applied to evaluate the capability of a destructive measurement system.
The Effect of Genetic Algorithm Parameters Tuning for Route Optimization in T...Muhammad Irfan Kemal
This study aims to analyze the effect of the population size, crossover probability, mutation probability, and the number of iterations on the distribution mileage of Indonesian largest logistics service provider in the Central Jakarta area with 43 distribution locations.
Multilinear Kernel Mapping for Feature Dimension Reduction in Content Based M...ijma
In the process of content-based multimedia retrieval, multimedia information is processed in order to
obtain descriptive features. Descriptive representation of features results in a huge feature count, which
leads to processing overhead. To reduce this descriptive feature overhead, various approaches have been
used for dimensionality reduction, among which PCA and LDA are the most used methods. However, these
methods do not reflect the significance of feature content in terms of the inter-relations among all dataset
features. To achieve a dimension reduction based on histogram transformation, features with low
significance can be eliminated. In this paper, we propose a feature dimensionality reduction approach
based on multi-linear kernel (MLK) modeling. A benchmark dataset is taken for the experimental work,
and the proposed approach is observed to improve upon the conventional system in the analysis.
Some Imputation Methods to Treat Missing Values in Knowledge Discovery in Dat...Waqas Tariq
One major problem in the data cleaning and data reduction step of the KDD process is the presence of missing values in attributes. Many analysis tasks have to deal with missing values, and several treatments have been developed to estimate them. One of the most common methods to replace missing values is mean imputation. In this paper we suggest a new imputation method combining factor-type and compromised imputation, using a two-phase sampling scheme, and we use this method to impute the missing values of a target attribute in a data warehouse. Our simulation study shows that the estimator of the mean from this method is more efficient than the others.
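The baseline the paper improves upon, plain mean imputation, is easy to state concretely (the column values here are invented):

```python
def mean_impute(column):
    """Replace missing values (None) in a numeric attribute with the
    mean of the observed values -- the common baseline the paper
    compares its factor-type/compromised estimator against."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]

ages = [23, None, 31, 27, None, 39]
print(mean_impute(ages))   # missing entries become 30.0
```

Mean imputation preserves the attribute mean but shrinks its variance, which is one motivation for the more refined estimators the paper develops.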
Evaluation of image segmentation and filtering with ann in the papaya leafijcsit
Precision agriculture is an area lacking cheap technology. The refinement of the production system brings
large advantages to the producer, and the use of images makes monitoring a cheaper methodology.
Macronutrient monitoring can determine the health and vulnerability of the plant in specific stages. In
this paper a method based on computational intelligence is analyzed for image segmentation in the
identification of symptoms of plant nutrient deficiency. Artificial neural networks are evaluated for
image segmentation and filtering; several variations of parameters and the insertion of impulsive noise were
evaluated as well. Satisfactory results are achieved with artificial neural networks for segmentation, even
with high noise levels.
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with the industry news, to celebrate the 13 years since the group was created we have articles including:
A case study of the use of Advanced Process Control at the wastewater treatment works at Lleida in Spain
A look back at an article on smart wastewater networks, to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
Courier management system project report.pdfKamal Acharya
Nowadays it is very important for people to send or receive articles like imported furniture, electronic items, gifts, business goods and the like. People depend vastly on different transport systems, which mostly use manual ways of receiving and delivering the articles. There is no way to track the articles until they are received, and no way to let the customer know what happened in transit once he has booked some articles. In such a situation, we need a system which completely computerizes cargo activities, including time-to-time tracking of the articles sent. This need is fulfilled by the Courier Management System, online software for cargo management people that enables them to receive goods from a source, send them to a required destination, and track their status from time to time.
Overview of the fundamental roles in Hydropower generation and the components involved in wider Electrical Engineering.
This paper presents the design and construction of hydroelectric dams from the hydrologist’s survey of the valley before construction, all aspects and involved disciplines, fluid dynamics, structural engineering, generation and mains frequency regulation to the very transmission of power through the network in the United Kingdom.
Author: Robbie Edward Sayers
Collaborators and co editors: Charlie Sims and Connor Healey.
(C) 2024 Robbie E. Sayers
Student information management system project report ii.pdfKamal Acharya
Our project explains about the student management. This project mainly explains the various actions related to student details. This project shows some ease in adding, editing and deleting the student details. It also provides a less time consuming process for viewing, adding, editing and deleting the marks of the students.
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Dr.Costas Sachpazis
Terzaghi's soil bearing capacity theory, developed by Karl Terzaghi, is a fundamental principle in geotechnical engineering used to determine the bearing capacity of shallow foundations. This theory provides a method to calculate the ultimate bearing capacity of soil, which is the maximum load per unit area that the soil can support without undergoing shear failure. The Calculation HTML Code included.
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdfKamal Acharya
The College Bus Management system is completely developed by Visual Basic .NET Version. The application is connect with most secured database language MS SQL Server. The application is develop by using best combination of front-end and back-end languages. The application is totally design like flat user interface. This flat user interface is more attractive user interface in 2017. The application is gives more important to the system functionality. The application is to manage the student’s details, driver’s details, bus details, bus route details, bus fees details and more. The application has only one unit for admin. The admin can manage the entire application. The admin can login into the application by using username and password of the admin. The application is develop for big and small colleges. It is more user friendly for non-computer person. Even they can easily learn how to manage the application within hours. The application is more secure by the admin. The system will give an effective output for the VB.Net and SQL Server given as input to the system. The compiled java program given as input to the system, after scanning the program will generate different reports. The application generates the report for users. The admin can view and download the report of the data. The application deliver the excel format reports. Because, excel formatted reports is very easy to understand the income and expense of the college bus. This application is mainly develop for windows operating system users. In 2017, 73% of people enterprises are using windows operating system. So the application will easily install for all the windows operating system users. The application-developed size is very low. The application consumes very low space in disk. Therefore, the user can allocate very minimum local disk space for this application.
Saudi Arabia stands as a titan in the global energy landscape, renowned for its abundant oil and gas resources. It's the largest exporter of petroleum and holds some of the world's most significant reserves. Let's delve into the top 10 oil and gas projects shaping Saudi Arabia's energy future in 2024.
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSEDuvanRamosGarzon1
AIRCRAFT GENERAL
The Single Aisle is the most advanced family aircraft in service today, with fly-by-wire flight controls.
The A318, A319, A320 and A321 are twin-engine subsonic medium range aircraft.
The family offers a choice of engines
Vaccine management system project report documentation..pdfKamal Acharya
The Division of Vaccine and Immunization is facing increasing difficulty monitoring vaccines and other commodities distribution once they have been distributed from the national stores. With the introduction of new vaccines, more challenges have been anticipated with this additions posing serious threat to the already over strained vaccine supply chain system in Kenya.
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)MdTanvirMahtab2
This presentation is about the working procedure of Shahjalal Fertilizer Company Limited (SFCL). A Govt. owned Company of Bangladesh Chemical Industries Corporation under Ministry of Industries.
Forklift Classes Overview by Intella PartsIntella Parts
Discover the different forklift classes and their specific applications. Learn how to choose the right forklift for your needs to ensure safety, efficiency, and compliance in your operations.
For more technical information, visit our website https://intellaparts.com
Classification accuracy analyses using Shannon’s Entropy
Shashi Poonam Indwar, Int. Journal of Engineering Research and Applications, www.ijera.com, ISSN: 2248-9622, Vol. 4, Issue 11 (Version 4), November 2014, pp. 01-12

Shashi Poonam Indwar* and Dr. Nilanchal Patel**
* Scientist B, Groundwater Hydrology Division, NIH Roorkee, Uttarakhand, India
** HOD, Department of Remote Sensing, BIT Mesra, Ranchi, Jharkhand, India

Abstract: There are many methods for determining classification accuracy. This paper demonstrates the significance of the entropy of training signatures in classification. The entropy of the training signatures of a raw digital image represents the heterogeneity of the brightness values of the pixels in different bands. This implies that an image comprising a homogeneous lu/lc category will be associated with nearly the same reflectance values, which results in a very low entropy value. On the other hand, an image characterized by the occurrence of diverse lu/lc categories will consist of widely differing reflectance values, due to which the entropy of such an image will be relatively high. This concept leads to the analysis of classification accuracy. Although entropy has been used many times in RS and GIS, its use in the determination of classification accuracy is a new approach.

Keywords: Classification, Entropy, Training signatures, Homogeneous, Heterogeneity
I. Introduction
Accuracy is considered to be the degree of closeness of results to the values accepted as true. Some of the accuracy assessment methods are: variance analysis, a minimum accuracy value used as an index of classification accuracy, spatial error and class attribute errors, and a probabilistic approach for change detection; land cover classes are abstractions and generalizations of the real world intended to provide discrete values for continuous phenomena. Techniques developed for accuracy assessment must take into consideration the factors that are sources of error in an image, and the methods used for assessing accuracy in a single image and for a pair of images. Assessing the accuracy of change detection products is an important step in the integration of remotely sensed data into an environmental management system as a decision-making support tool. Previous work has examined the influence of accuracy and classification performance based on the confusion matrix and the measures derived from it (overall classification accuracy, producer's accuracy and kappa coefficient) in change detection studies, the factors influencing accuracy assessment, and the accuracy assessment aspects for change detection and classification, with and without test data and with cross-validation methods (Fl. Zavoianu, A. Caramizoiu and D. Badea). There are many sources of both conservative and optimistic bias in classification accuracy assessment. The three sources of optimistic bias are: use of training data for accuracy assessment, restriction of reference data sampling to homogeneous areas, and sampling of reference data that is not independent of the training data. The magnitude and direction of bias in classification accuracy estimates depend on the methods used for classification and reference data sampling (T. O. Hammond and D. L. Verbyla). One study's main objective was to assess the classification accuracy of a forest map classified from Landsat TM data using different numbers of reference data (200 and 388 reference samples).
This comparison was made through an observation approach (200 reference samples) and combined interpretation and observation approaches (388 reference samples). Five land cover classes, namely primary forest, logged-over forest, water bodies, bare land and agricultural crop/mixed horticulture, could be identified by differences in spectral wavelength. The results showed that the overall accuracy from 200 reference samples was 83.5% (kappa value 0.7502459, kappa variance 0.002871). However, when the 200 references were increased to 388 in the confusion matrix, the accuracy improved slightly from 83.5% to 89.17%, with the Kappa statistic increasing from 0.7502459 to 0.8026135 (Mohd Hasmadi Ismail and Kamaruzaman Jusoff). A geostatistical (model-based) framework for spatial accuracy assessment of land-cover classifications has also been developed. The key component of the proposed framework was its ability to account for spatial or spatiotemporal correlation in observed classification errors, as well as to accommodate different data supports, without relying on probability-based sampling designs. Under this geostatistical framework, confidence intervals were derived for classification accuracy in each class, overall accuracy among all classes and the kappa coefficient (Phaedon C. Kyriakidis and Jingxiong Zhang). In another study, 48 surface soil samples representing the Yazd-Ardakan plain were collected and surface soil salinity was measured. Ten soil samples were used for investigating map accuracy. These soil samples, together with ten more soil samples that had high similarity in spectral reflectance
and geomorphological characteristics, were used to examine the produced soil salinity map and to assess its accuracy. According to the results, the produced soil salinity map had an overall accuracy of 87% and a Kappa index of 47%, indicating an acceptable accuracy for this classification (R. Taghizadeh Mehrjardi, Sh. Mahmoodi, M. Taze and E. Sahebjalal). Before implementing a classification accuracy assessment, one needs to know the sources of errors (Congalton and Green 1993, Powell et al. 2004). In addition to errors from the classification itself, other sources of errors, such as position errors resulting from registration, interpretation errors, and poor quality of training or test samples, all affect classification accuracy. In the process of accuracy assessment, it is commonly assumed that the difference between an image classification result and the reference data is due to classification error. However, in order to provide a reliable report on classification accuracy, non-image classification errors should also be examined, especially when reference data are not obtained from a field survey. A classification accuracy assessment generally includes three basic components: sampling design, response design, and estimation and analysis procedures (Stehman and Czaplewski 1998). Selection of a suitable sampling strategy is a critical step (Congalton 1991). The major components of a sampling strategy include the sampling unit (pixels or polygons), sampling design, and sample size (Muller et al. 1998). Possible sampling designs include random, stratified random, systematic, double, and cluster sampling. A detailed description of sampling techniques can be found in previous literature such as Stehman and Czaplewski (1998) and Congalton and Green (1999). The error matrix approach is the one most widely used in accuracy assessment (Foody 2002b).
In order to properly generate an error matrix, one must consider the following factors: (1) reference data collection, (2) classification scheme, (3) sampling scheme, (4) spatial autocorrelation, and (5) sample size and sample unit (Congalton and Plourde 2002). After generation of an error matrix, other important accuracy assessment elements, such as overall accuracy, omission error, commission error, and kappa coefficient, can be derived. Previous literature has defined the meanings of and provided computation methods for these elements (Congalton and Mead 1983, Hudson and Ramm 1987, Congalton 1991, Janssen and van der Wel 1994, Kalkhan et al. 1997, Stehman 1996, 1997, Congalton and Green 1999, Smits et al. 1999, Congalton and Plourde 2002, Foody 2002b, 2004a). Meanwhile, many authors, such as Congalton (1991), Janssen and van der Wel (1994), Smits et al. (1999), and Foody (2002b), have conducted reviews on classification accuracy assessment. They have assessed the status of accuracy assessment of image classification and discussed relevant issues. Congalton and Green (1999) systematically reviewed the concepts of basic accuracy assessment and some advanced topics involved in fuzzy-logic and multilayer assessments, and explained principles and practical considerations in designing and conducting accuracy assessment of remote-sensing data. The Kappa coefficient is a measure of the overall statistical agreement of an error matrix, which takes non-diagonal elements into account. Kappa analysis is recognized as a powerful method for analysing a single error matrix and for comparing the differences between various error matrices (Congalton 1991, Smits et al. 1999, Foody 2004a). The modified kappa coefficient and tau coefficient have been developed as improved measures of classification accuracy (Foody 1992, Ma and Redmond 1995).
Moreover, accuracy assessment based on a normalized error matrix has been conducted, which is regarded as a better presentation than the conventional error matrix (Congalton 1991, Hardin and Shumway 1997, Stehman 2004). The error matrix approach is only suitable for 'hard' classification, assuming that the map categories are mutually exclusive and exhaustive and that each location belongs to a single category. This assumption is often violated, especially for classifications with coarse spatial resolution imagery. 'Soft' classifications have been performed to minimize the mixed-pixel problem using fuzzy logic. The traditional error matrix approach is not appropriate for evaluating these soft classification results. Accordingly, many new measures, such as conditional entropy and mutual information (Finn 1993, Maselli et al. 1994), fuzzy-set approaches (Gopal and Woodcock 1994, Binaghi et al. 1999, Woodcock and Gopal 2000), the symmetric index of information closeness (Foody 1996), the Renyi generalized entropy function (Ricotta and Avena 2002), and a parametric generalization of Morisita's index (Ricotta 2004) have been developed. However, one critical issue in assessing fuzzy classifications is the difficulty of collecting reference data. More research is thus needed to find a suitable approach for evaluating fuzzy classification results. In summary, the error matrix approach is the most common accuracy assessment approach for categorical classes. Uncertainty and confidence analysis of classification results has gained some attention recently (McIver and Friedl 2001, Liu et al. 2004), and spatially explicit data on mapping confidence are regarded as an important aspect in effectively employing classification results for decision making (McIver and Friedl 2001, Liu et al. 2004). In this paper a new concept has been introduced that attempts to establish a relationship between the homogeneity of training brightness values and the classification accuracy determined.
Three sets of training signatures viz. pure, impure and moderately pure for
individual lu/lc categories were extracted from the heart or centre, the periphery, and somewhere between the centre and periphery of the respective classes. Subsequently, classification was performed using each of the three sets of training signatures separately. Classification accuracy was determined from the classified images generated using the respective training samples, and was examined in relation to the three types of training samples. This is based on the logic that pure training pixels extracted from the centre would comprise more homogeneous brightness values, resulting in a low entropy value, while (impure) training signatures extracted from the periphery of a class would comprise relatively heterogeneous brightness values with a higher entropy value. Thus the entropy of the training signatures could be used as a potential indicator of the purity of the signatures.

1.1 Significance of Entropy of Training Signatures in Classification: Entropy of the raw digital image represents the heterogeneity of the brightness values of the pixels in different bands. This implies that an image comprising a homogeneous lu/lc category will be associated with nearly the same reflectance values, which results in a very low entropy value. On the other hand, an image characterized by the occurrence of diverse lu/lc categories will consist of widely differing reflectance values, due to which the entropy of such an image will be relatively high. On a classified image, entropy is determined by the number of land use and land cover categories present; if an image consists of only one category, its entropy is zero. Mathematically, entropy expresses the disorder of a system (here, a spectral band), and is given by the following formula (O'Neill et al., 1988):

H = -Σ pi log pi

where
H = entropy of the spectral band of the image,
pi = fi / N,
fi = frequency of the i-th brightness value, and
N = total number of pixels.
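The formula above can be sketched in a few lines of code. The example below is illustrative rather than the authors' implementation; in particular, the base-10 logarithm is an assumption inferred from the tabulated values later in the paper (e.g. 0.30102 for standing water bodies, which matches log10 2, the entropy of two equally frequent brightness values).

```python
import math
from collections import Counter

def shannon_entropy(band_pixels):
    """H = -sum(p_i * log10(p_i)) over the distinct brightness values
    of one spectral band, with p_i = f_i / N (base-10 log assumed)."""
    n = len(band_pixels)
    freq = Counter(band_pixels)                 # f_i for each brightness value
    return -sum((f / n) * math.log10(f / n) for f in freq.values())

# A perfectly homogeneous sample has zero entropy ...
print(abs(shannon_entropy([118] * 50)))         # → 0.0
# ... while two equally frequent values give log10(2).
print(round(shannon_entropy([20, 20, 45, 45]), 5))  # → 0.30103
```

As the paper argues, the more homogeneous the brightness values, the fewer (and more concentrated) the frequency classes, and the lower H becomes.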
Entropy of training samples can be used to determine their purity, i.e. how homogeneous the training pixels are, and the entropy of the training data set is related to the accuracy of classification. Since classification is one of the major techniques used for mapping impervious and pervious layers in this study, it becomes essential to determine the purity of the training samples. If the impurity of a training data set is high, it signifies the presence of heterogeneous signatures, which leads to lower classification accuracy; if the impurity is low, it signifies homogeneous signatures, resulting in higher classification accuracy. The entropy of pure training samples will be lower, while impure training samples will exhibit higher entropy values. It is normally expected that purer training samples for a particular lu/lc category will result in higher classification accuracy. Therefore, the classification accuracy of any category could be related to the entropy of the training samples of that category: higher classification accuracy for a particular category will be achieved if the entropy of the training samples for that category is smaller, and vice versa. In other words, the entropy of the training samples can be considered a significant indicator of their purity, which in turn could direct the analyst to choose more accurate training samples and/or change the sampling strategy. In addition, classification accuracy could also be tested vis-à-vis the entropy of the training samples of different categories, although other factors may play a vital role in classification accuracy, such as the classification technique involved, the type of data used (raw or atmospherically corrected), the resolution of the data, etc. (Congalton & Green, 1999).

1.2 Objectives: The present study has been carried out with the following objectives.
(i) To compute the entropy values of the three different types of training signatures (viz. pure, moderately pure and impure), based on the purity or homogeneity of the pixel values, for the four lu/lc categories considered for the investigation: standing water bodies, forest, agricultural land and dense built-up area.
(ii) To determine the difference in the entropy values of the three different types of signatures for each lu/lc category.
(iii) To correlate the entropy computed for each type of training signature (viz. pure, moderately pure and impure) with the classification accuracy obtained by using the corresponding training signature for the respective lu/lc category.
These tasks have been performed on the satellite data of both years, i.e. 1996 and 2004, in order to revalidate the analysis.

1.3 Data used:
(i) IRS 1C LISS-III (105/055) of 22nd December, 1996
(ii) IRS P6 LISS-III (105/055) of 20th February, 2004

1.4 Software used:
(i) Erdas Imagine 8.5
(ii) ArcView
(iii) ArcGIS (ArcMap)
(iv) MS Excel
1.5 Detailed Methodology:
(i) Three sets of training signatures viz. pure, moderately pure, and impure were extracted for four classes i.e. standing water bodies, forest, agricultural lands and built-up using ERDAS 8.5 Software.
(ii) Training samples (AOI) of each class, i.e. standing water bodies, forest, agricultural lands and built-up, were subset from the main image through Data Preparation > Subset Image, using the AOI group icon.
(iii) The frequency and total number of training pixels, band-wise, were noted from the subset images of each class via the raster attribute table.
(iv) From the raster attribute editor, all rows were selected and the frequency and total number of pixels were copied into Microsoft Excel to calculate pi and log pi.
(v) The values of pi and log pi were put into the formula H = -Σ pi log pi.
(vi) The H values were represented in tables and graphs to show the relation between impurity and classification accuracy.
(vii) Classification of images of each year based on the three types of training signatures i.e. pure, moderately pure and impure samples.
(viii) Determination of Producer's accuracy based on the different types of training signatures.
(ix) Generation of (a) the plot between purity of training signatures and entropy, (b) the plot between entropy and Producer's accuracy, and (c) the plot between purity of training signatures and Producer's accuracy, with comparative analysis between the two years.
(x) Comparative analysis and discussion of results.
1.5.1 Methodology Flowchart (boxes in sequence):
1. Selection of pure, moderately pure and impure data sets for the four lu/lc classes from the images of both years, i.e. 1996 and 2004.
2. Subset the selected training data sets from the main image (FCC).
3. Note the frequency and total number of pixels band-wise from the attribute table of the training data sets for each class.
4. Calculate pi and log pi in Microsoft Excel.
5. Put the values of pi and log pi into the entropy formula H = -Σ pi log pi and compute H.
6. Classification of the images of each year based on the three types of training signatures, i.e. pure, moderately pure and impure samples.
7. Determination of Producer's Accuracy (PA) based on the different types of training signatures.
8. Generation of (i) the plot between purity of training signatures and entropy, (ii) the plot between entropy and PA, and (iii) the plot between purity of training signatures and PA, with comparative analysis between the two years.
9. Comparative analysis and discussion of results.
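The frequency-to-entropy part of this workflow (note band-wise frequencies from the raster attribute table, form pi = fi/N, compute H) can be mirrored programmatically instead of in Excel. The sketch below is illustrative, not the authors' code: the attribute tables are hypothetical, and the base-10 logarithm is an assumption inferred from the tabulated entropy values.

```python
import math

def entropy_from_freq_table(freqs):
    """H = -sum(p_i * log10(p_i)) from a band's raster attribute table,
    given as a dict {brightness value: frequency f_i}."""
    n = sum(freqs.values())                       # total number of pixels N
    return -sum((f / n) * math.log10(f / n) for f in freqs.values())

# Hypothetical attribute tables for one band of a pure (centre) sample
# and an impure (periphery) sample of the same class:
pure_band   = {62: 198, 63: 2}                    # almost a single brightness value
impure_band = {62: 60, 75: 50, 101: 45, 140: 45}  # mixed brightness values

print(round(entropy_from_freq_table(pure_band), 5))    # near zero
print(round(entropy_from_freq_table(impure_band), 5))  # markedly higher
```

Repeating this per band and per signature type yields a table of the same shape as Table 1.1 below.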
1.6 Procedure/Strategy of Sampling of the Training Signatures: In the present study, four lu/lc categories were selected for determining the relation between entropy and classification accuracy based on the purity of the signatures: standing water bodies, forest, agricultural land and built-up area. The standing water bodies comprise a large reservoir and a lake. The forest is characterized by three crown densities, namely dense, moderate and open; the built-up area mainly comprises the highly dense, congested conglomerate of residential and shopping areas located at the centre of the city; and the agricultural land comprises a vast stretch of fallow land without standing crop.
For each of these categories, training samples corresponding to the pure, impure and moderately pure types were extracted from the centre, the periphery, and somewhere between the centre and periphery respectively, taking into consideration the homogeneity of the brightness values at these three locations. For example, the training pixels are expected to be more homogeneous towards the centre of any lu/lc class and to become more heterogeneous towards the periphery. Similar criteria for the selection of training samples were employed on the images of both years, i.e. 1996 and 2004. The purpose of performing the entropy-training sample homogeneity-classification accuracy relationship analyses in two different years is primarily to revalidate the findings of this study, keeping in view the lu/lc dynamics and their impact on pixel homogeneity within the different categories. The standard Maximum Likelihood classifier was employed to classify the satellite images of both years, i.e. 1996 and 2004, using the three types of training signatures, viz. pure, impure and moderately pure, selected for the four lu/lc categories.
Table 1.1 Entropy (H) values of the three types of signatures for different lu/lc categories
(a) Impure Training Signature:
Lu/Lc          Standing water bodies   Forest                Agricultural Lands    Built-up
Year           1996      2004          1996      2004        1996      2004        1996      2004
Band 1 (G)     0.27643   0.88737       0.53009   1.06534     1.01555   1.14234     1.09333   1.28063
Band 2 (R)     0.27643   0.82047       1.06758   1.13094     1.08071   1.15497     0.97657   1.24588
Band 3 (NIR)   0.47712   0.72832       0.98863   1.15132     1.00041   1.04199     1.11772   1.23226
Band 4 (MIR)   0.47712   0.82047       1.2408    1.18921     1.00041   1.10889     1.21037   1.27702
(b) Moderately Pure Training Signature:

Lu/Lc          Standing water bodies   Forest                Agricultural Lands    Built-up
Year           1996      2004          1996      2004        1996      2004        1996      2004
Band 1 (G)     0.30102   0.73644       1.05843   1.09636     0.97882   0.93191     1.09333   1.2455
Band 2 (R)     0.30102   0.87958       1.09252   1.09636     0.92865   0.87718     0.97657   1.22924
Band 3 (NIR)   0.30102   0.93979       1.07547   1.10833     0.7005    0.89197     1.11772   1.17287
Band 4 (MIR)   0.30102   0.81937       1.1679    1.10833     0.85954   1.04137     1.21037   1.17287
(c) Pure Training Signature:

Lu/Lc          Standing water bodies   Forest                Agricultural Lands    Built-up
Year           1996      2004          1996      2004        1996      2004        1996      2004
Band 1 (G)     0.30102   0.30102       0.74707   1.04199     0.71496   0.77814     1.00092   1.136
Band 2 (R)     0.30102   0.30102       0.82916   1.02117     0.82785   0.77814     0.9149    1.0557
Band 3 (NIR)   0.30102   0.30102       0.91662   0.81603     0.82785   0.6778      1.10317   0.85499
Band 4 (MIR)   0.30102   0.30102       0.9678    1.10889     0.90312   0.6778      1.10317   1.04056
1.7 Results and Discussion:
1.7.1 Entropy values of the three types of training signatures:
1.7.1.1 Standing water bodies: Examination of Figures 1.1 a (i) and (ii) reveals the following observations.
(i) The entropy values decrease as the purity of the training signature increases.
(ii) Comparative analysis of the entropy values between the years 1996 and 2004 reveals that the entropy values are considerably lower in 1996 for the respective categories of training signatures as compared to 2004.
(iii) The entropy values for the three different categories of training signatures are nearly the same in 1996, while in 2004 there is a considerable difference in entropy between the impure and pure training signatures.
(iv) The pure training signatures are associated with very low entropy values. This signifies that maximum homogeneity in the brightness values occurs at the centre of the water bodies, from where the pure training samples were selected, while pixel homogeneity drastically decreases as one moves away from the centre towards the periphery.
(v) There is no systematic variation of the entropy values among the different spectral bands.
(vi) The lower entropy values in 1996 could be attributed to the prevalence of more homogeneous conditions within the water body in that year as compared to 2004.
1.7.1.2 Forest: Comparative analyses of Figures 1.1 b (i) and (ii) indicate the following.
(i) There is little variation in the entropy values between the two years; however, the entropy values decrease as the purity of the training signatures increases in both years. This could be attributed to pixel homogeneity towards the central portion of the forest class, as observed in the field, while pixel homogeneity decreases as one moves towards the periphery, where the canopy density decreases, giving rise to mixed or impure training signatures.
(ii) Another significant observation apparent from the figures is the occurrence of the highest entropy in the MIR band, which could be attributed to the high spectral reflectance characteristics of vegetation.
1.7.1.3 Agriculture: Agricultural lands exhibit nearly the same behaviour as the forest: entropy increases as the impurity of the training signatures increases (Figures 1.1 c (i) and (ii)).
1.7.1.4 Built-up areas: In both years, i.e. 1996 and 2004, the entropy values for the impure and moderately pure training signatures are nearly the same, with the former category possessing slightly higher entropy than the latter. However, the pure training signatures are associated with significantly lower entropy values (Figures 1.1 d (i) and (ii)).
Comparison among the entropy values of the respective training signatures, viz. pure, impure and moderately pure, across the four lu/lc categories in the individual years indicates that the water bodies possess the smallest values, while the remaining three lu/lc categories, viz. forest, agricultural land and built-up, are associated with nearly the same entropy values. This observation is attributed to the fact that the water bodies in the study area are characterized by a significantly larger amount of homogeneity as compared to the other classes. As such, the
agricultural land in the study area comprises fallow land with no standing crop, thereby resulting in larger variation in the brightness values.
1.7.2 Correlation between Entropy and Producer's Accuracy: Figures 1.2 (a-d) show the relationship between the Producer's accuracy and the combined entropy of the three different types of training signatures, viz. pure, impure and moderately pure, for the four lu/lc categories in the respective years, i.e. 1996 and 2004. Analyses of the figures reveal that the classification accuracy decreases as the entropy, or impurity, increases. There is little or no variation in the classification accuracy obtained with the pure and moderately pure training signatures, while the classification accuracy drastically decreases for the impure samples.
1.7.3 Comparison among the Producer's Accuracy of Different Types of Training Signatures: Figures 1.3 (a-d) show bar charts of the Producer's accuracy determined for the lu/lc categories using the three types of training signatures for 1996 and 2004. Comparative analysis of the classification accuracy bar charts of the lu/lc categories reveals the following.
(i) First, the classification accuracy is highest for the pure samples, with a decreasing trend towards the impure samples, in both years.
(ii) Second, the classification accuracy associated with the impure samples is appreciably lower
in the year 2004 as compared to 1996 that may be attributed to the occurrence of a large amount of heterogeneity in the spectral characteristics of different lu/lc classes (towards their periphery) in the year 2004.
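The entropy discussed above is the Shannon entropy of the brightness-value distribution within a training signature. As a rough illustration of the concept (not the authors' exact computation; the bin count and per-band handling here are assumptions), the entropy of a sample in one band can be sketched as:

```python
import numpy as np

def signature_entropy(pixels, n_bins=256):
    """Shannon entropy (bits) of the brightness values of a training
    signature in one band. A low value indicates a homogeneous (pure)
    sample; a high value indicates a heterogeneous (impure) sample."""
    counts, _ = np.histogram(pixels, bins=n_bins, range=(0, n_bins))
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Synthetic samples: a narrow-spread "water" signature versus a
# wide-spread "built-up" signature (illustrative data only).
rng = np.random.default_rng(0)
water = rng.normal(20, 1, 500).clip(0, 255).astype(int)
builtup = rng.integers(0, 256, 500)
assert signature_entropy(water) < signature_entropy(builtup)
```

A perfectly uniform sample (every pixel the same value) yields zero entropy, matching the intuition that pure water-body signatures sit at the low end of the entropy scale.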
1.8 Conclusion: The following conclusions can be drawn from the analyses carried out in this chapter.
1. The entropy of the training signatures is strongly related to their purity. Pure training signatures give rise to low entropy, characteristic of homogeneous brightness values, whereas impure training samples are associated with large entropy resulting from a heterogeneous pattern of brightness values. In other words, the entropy of the training samples can be used as an indicator of their purity.
2. There is also a significant relationship between producer's accuracy and the entropy of the training samples. Producer's accuracy is found to be higher for pure training samples, characterized by low entropy values, whereas moderately pure and impure training samples, associated with larger entropy values, lead to lower producer's accuracy.
3. Water bodies, as expected, are associated with greater homogeneity of brightness values and lower entropy values than the other lu/lc categories.
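The producer's accuracy used throughout these comparisons is computed from the error (confusion) matrix of the classification. A minimal sketch follows; the matrix values are hypothetical, and the rows-as-reference convention is an assumption (conventions differ between authors):

```python
import numpy as np

def producers_accuracy(cm):
    """Producer's accuracy per class from an error matrix.
    Assumed convention: rows = reference (ground truth),
    columns = classified label. The producer's accuracy of class i
    is cm[i, i] divided by the row total, i.e. the fraction of
    reference pixels of that class that the map labelled correctly."""
    cm = np.asarray(cm, dtype=float)
    return np.diag(cm) / cm.sum(axis=1)

# Hypothetical two-class matrix: water vs. built-up
cm = [[98, 2],    # 98 of 100 reference water pixels mapped as water
      [10, 90]]   # 90 of 100 reference built-up pixels mapped correctly
pa = producers_accuracy(cm)   # roughly 0.98 for water, 0.90 for built-up
```

Omission errors (reference pixels assigned to the wrong class) are what drag producer's accuracy down for impure, high-entropy training samples.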
[Figure panels a(i)-d(ii) omitted.]
Figure 1.1 Entropy Values of the Three Signature Types for Different Lu/Lc.
[Figure panels a(i)-d(ii) omitted. Each panel plots producer's accuracy, PA (%), against the entropy of the pure, moderately pure and impure training signatures. The entropy values recoverable from the panels are:

Lu/lc category (year)           Pure      Moderate  Impure
Standing waterbodies (1996)     1.20408   1.20408   1.5071
Standing waterbodies (2004)     1.20408   3.37518   3.25663
Forest areas (1996)             3.46065   4.39432   3.8271
Agricultural lands (1996)       3.27378   3.46751   4.09
Agricultural lands (2004)       2.91188   3.74243   4.44819
Built-up (1996)                 4.12216   4.39799   4.33964
Built-up (2004)                 4.08725   4.82048   5.03579]

Figure 1.2 Correlation Plot between Producer's Accuracy (PA) and Entropy of Different Lu/Lc.

[Figure panels (a)-(d) omitted. Each panel is a bar chart of PA (%) against signature purity (pure, moderate, impure) for 1996 and 2004, one panel per lu/lc category: standing waterbodies, forest, agricultural lands and built-up.]

Figure 1.3 Correlation among Pure, Impure and Moderately Pure Training Signatures and Producer's Accuracy (PA) of Different Lu/Lc.
Figure 1.1 FCC of IRS 1C LISS III Data of December 1996 with Training Signatures of Different Purity Superimposed.