This document describes the weights-of-evidence methodology, which combines spatial data from diverse sources to analyze relationships and build predictive models. Originally developed for medical diagnosis, the method has been adapted to mineral potential mapping with GIS. The Arc-WofE extension calculates weights for evidential themes, such as rock types or geochemistry, based on their spatial association with training points such as known mineral sites. These weights are combined to produce a response theme, a map expressing the probability of finding additional sites. The methodology involves exploring spatial relationships, generalizing evidential themes, and testing the predictive model.
Predictive Probabilistic Modeling Using ArcView GIS
Weights of Evidence
By Gary L. Raines, Graeme F. Bonham-Carter, and Laura Kemp

Editor's Note: The U.S. Geological Survey and the Geological Survey of Canada, with funding from five mining companies, have developed an ArcView GIS extension called Arc-WofE that uses a quantitative method known as weights of evidence for mapping potential mineral sites with GIS. This article describes the weights-of-evidence method and how it was used to develop the Arc-WofE extension.
Weights-of-evidence methodology combines spatial data from diverse sources to describe and analyze interactions, provide support for decision makers, and make predictive models. The method was originally developed for a nonspatial application in medical diagnosis. In this application, the evidence consisted of a set of symptoms, and the hypothesis was "this patient has disease x." For each symptom, a pair of weights was calculated, one for presence of the symptom and one for absence of the symptom. The magnitude of the weights depended on the measured association between the symptom and the occurrence of disease in a large group of patients. The weights could then be used to estimate the probability that a new patient would get the disease, based on the presence or absence of symptoms.
The weights-of-evidence methodology was adapted for mineral potential mapping with GIS. The Arc-WofE extension uses the statistical association between a training points theme, such as a theme showing mineral sites, and evidential themes showing rock types, geochemistry, or geophysical measurements to determine the weights.
The weights-of-evidence method is based on the application of Bayes' Rule of Probability, with an assumption of conditional independence. The model is stated in log-linear form so the weights from the evidential themes can be added. The Arc-WofE extension uses the weights-of-evidence methodology to produce a response theme. The response theme is an output map that combines the weights of predictor variables from the evidential themes to express the probability that a unit cell (small unit of area) will contain a training point. Evidential themes may have categorical values (e.g., the classes on geology or soil maps) or ordered values (e.g., geochemical concentrations or distance to linear and other spatial objects).
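As a minimal sketch of the arithmetic behind this, using the standard weights-of-evidence notation from the literature rather than anything reproduced from this article: for a binary evidential pattern B and training points D, the pair of weights are log ratios of conditional probabilities, and under conditional independence the posterior logit is the prior logit plus the sum of the applicable weights over all themes.

```latex
W^{+} = \ln \frac{P(B \mid D)}{P(B \mid \bar{D})},
\qquad
W^{-} = \ln \frac{P(\bar{B} \mid D)}{P(\bar{B} \mid \bar{D})},
\qquad
\operatorname{logit} P(D \mid \text{evidence})
  = \operatorname{logit} P(D) + \sum_{i=1}^{n} W_i^{\pm}
```

Here W_i± stands for W_i+ if theme i is present at the unit cell and W_i- if it is absent, which is why the log-linear form lets the evidential themes simply be added.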
Weight values are easy to interpret. A positive weight for a particular evidential-theme value indicates that more training points occur on that theme than would be expected by chance, whereas the converse is true for negative weights. A weight of zero indicates that the training points are spatially uncorrelated with the theme. The range in weight values for a particular evidential theme, known as the contrast, gives an overall measure of how important the theme is in the model. Uncertainties due to variances of weights and missing data allow the relative uncertainty in the posterior probability to be estimated and mapped. Because conditional independence is never satisfied completely, the posterior probabilities are usually overestimated in absolute terms. However, the relative variations in posterior probability (as seen in spatial patterns on the response map) are usually not much affected by violations of this assumption.
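For reference, the contrast and its uncertainty have simple closed forms in the weights-of-evidence literature (a sketch, assuming the usual definitions rather than quoting this article):

```latex
C = W^{+} - W^{-},
\qquad
s(C) = \sqrt{s^{2}(W^{+}) + s^{2}(W^{-})}
```

The studentized contrast C / s(C) is commonly used as an informal significance test when judging whether a theme's spatial association is strong enough to keep it in the model.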
Mapping Quantitative Mineral Potential
The GIS-based mineral potential mapping process can be broken down into four main steps.
1. Building a spatial digital database
2. Extracting predictive evidence for a particular deposit type based on an exploration model
3. Calculating weights for each predictive map or evidential theme
4. Combining the evidential themes to predict mineral potential
The Arc-WofE extension assumes that a spatial digital database exists. Step 2 may have been partially completed prior to using the Arc-WofE extension, but some of Step 2 and all of Step 3 are done within the extension. The examination of spatial relationships between training points and evidential themes is exploratory. It yields measurements of spatial association that are often unexpected. The process of generalization, grouping classes of evidential themes, and identifying breaks in continuous variables that maximize spatial associations often leads to a better understanding of the data. In Step 4 a map is generated that shows the combined evidence and is valuable for identifying areas for subsequent exploration. The map also gives a measure of confidence in the results, based on the uncertainty due to the variances of weights and/or missing data.
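To make Step 4 concrete, here is a minimal, hypothetical Python sketch of combining binary evidential themes into a posterior probability surface via the log-linear model. The grids, weight values, and prior are invented for illustration; Arc-WofE itself performs this on ArcView grid themes.

```python
import numpy as np

# Hypothetical study area rasterized into 200 x 200 unit cells.
# Each evidential theme is a binary grid: 1 = pattern present, 0 = absent.
rng = np.random.default_rng(0)
themes = {
    "geology": rng.integers(0, 2, size=(200, 200)),
    "geochem": rng.integers(0, 2, size=(200, 200)),
}

# Illustrative (W+, W-) pairs per theme, as a weights calculation might yield.
weights = {"geology": (1.2, -0.4), "geochem": (0.8, -0.2)}

# Prior probability: expected training points per unit cell in the study area.
prior = 0.001
posterior_logit = np.full((200, 200), np.log(prior / (1.0 - prior)))

# Log-linear model: add W+ where the pattern is present, W- where it is absent.
for name, grid in themes.items():
    w_plus, w_minus = weights[name]
    posterior_logit += np.where(grid == 1, w_plus, w_minus)

# Back-transform the summed logits to posterior probabilities.
posterior = 1.0 / (1.0 + np.exp(-posterior_logit))
print(posterior.min(), posterior.max())
```

Because conditional independence rarely holds exactly, the absolute values of such a surface are best read as relative favorability, as the article notes above.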
[Sidebar: Extract Contacts creates a grid theme with a boundary between user-specified attributes of a polygon shape or grid theme. The Buffer Features menu choice operates on shape or grid themes, producing multiple buffer zones.]
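For readers without ArcView at hand, the effect of multi-zone buffering can be mimicked in a few lines of Python with the Shapely library (a hypothetical stand-in; Arc-WofE's Buffer Features operates on ArcView shape or grid themes, not Shapely geometries):

```python
from shapely.geometry import Point

# A hypothetical point feature, e.g., a mapped mineral occurrence.
feature = Point(0.0, 0.0)

# Successive buffer distances (in map units) bounding each zone.
distances = [100.0, 200.0, 400.0, 800.0]

# Each buffer zone is the ring between consecutive buffer polygons.
previous = None
for d in distances:
    disk = feature.buffer(d)
    ring = disk if previous is None else disk.difference(previous)
    print(f"zone out to {d:g}: area = {ring.area:,.0f}")
    previous = disk
```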
Weights-of-evidence methodology is normally applied to exploration situations in which there is an adequate number of mineral deposits or other located data that can be used as training points for calculating weights. However, the extension has been designed so that the user can also calculate weights when such data are lacking. The training set can be imaginary, and the overlap relationships between the points and the selected evidential themes are decided arbitrarily by the user.

This leads to weights governed by expert opinion. In all other respects the treatment of the problem is the same. If a user decides that the available training points do not represent an adequate sample of the deposits actually present in the area, the weighting of the evidential themes can be based on expert judgment instead of statistical associations. This is advantageous for developing a variety of scenarios with different weighting schemes, reflecting the differing opinions of experts.
Creating a Model with Arc-WofE
The Arc-WofE extension was developed by the U.S. Geological Survey and the Geological Survey of Canada with funding from the Western Mining Corporation, BHP, Barrick Gold Corporation, Inco Limited, Placer Dome Inc., and PT Freeport Indonesia. The extension was made available to the public in May 1999. It can be downloaded at no charge from the Arc-WofE home page (gis.nrcan.gc.ca/software/arcview/wofe). The Arc-WofE extension requires ArcView Spatial Analyst as well as ArcView GIS.
Modeling typically proceeds in three phases: specification, prediction, and testing. The specification of the model begins with the definition of a set of sites where some phenomenon has been observed, such as mineral deposits, earthquake epicenters, pack rat middens, or other point objects. These sites are training points. The assumption is that a collection of training points will, in aggregate, have common characteristics that will allow their presence in other similar sites to be predicted. Other specifications required for the model include a defined study area, preparation of data for use in evidential themes, exploration of spatial associations between potential evidential themes and training points, and the generalization of evidential themes. The tools provided in the Arc-WofE extension are particularly valuable for spatial data exploration and generalization.
Arc-WofE Tools

Loading the Arc-WofE extension modifies the View GUI to include the Weights-of-Evidence menu. This menu contains six menu item groupings. The top four groupings reflect the normal sequence of operations in using the Arc-WofE tools. The first grouping includes the Set Analysis Parameters... and Select Point by Location... menu choices that are used to select the study area, training set, and model parameters. The second grouping contains tools for exploring spatial associations between training points and evidential themes. Choices in the third grouping specify the model that will be used to produce the response map. Tools in the fourth menu grouping are used to check assumptions and evaluate predictions. Preprocessing tools designed primarily for geological applications are available from the fifth menu grouping. The user's manual for the Arc-WofE extension, available from the last menu grouping, gives descriptions of the use of all menu choices.
Calculating Weights

Reducing the number of classes, often to just two, improves the statistical robustness of the weights. However, multiclass evidential themes are allowable and sometimes desirable. Three approaches to calculating weights are possible: categorical, cumulative ascending, and cumulative descending.

Categorical weights are used for themes whose attribute values are unordered (e.g., names) and for themes with ordered values in cases where midrange values are most predictive. Cumulative ascending weights calculations are used in proximity analysis. In this type of analysis the zones near the training sites are most predictive, and the area and number of points are determined cumulatively, starting with the first buffer zone around a feature, then the first plus the second, and so on. For each cumulative distance the weights and contrast are determined, treating that distance as a cut point between binary presence (close to the feature) and absence (far from the feature). This allows decisions about reclassification to be based on the resulting statistics. Cumulative descending weights calculations work in the opposite direction and are applicable for themes where the points are mainly associated with high values (e.g., in defining geochemical anomalies with high metal concentrations).
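The article does not spell out the weight formulas, but they follow from simple counts of training points and unit cells on and off a theme class. The sketch below is a minimal Python rendering of the standard weights-of-evidence count formulas; the function name and arguments are illustrative, and degenerate counts (a class with no points, or a class covering the whole study area) would need guarding in practice.

import math

def weights(n_points, n_cells, class_points, class_cells):
    """Weights for one theme class treated as present versus absent.

    n_points     -- total training points in the study area
    n_cells      -- total study area measured in unit cells
    class_points -- training points falling on the class
    class_cells  -- area of the class measured in unit cells
    """
    # Conditional probabilities of being on/off the class, given that a
    # unit cell does or does not contain a training point.
    p_on_given_d   = class_points / n_points
    p_on_given_nd  = (class_cells - class_points) / (n_cells - n_points)
    p_off_given_d  = (n_points - class_points) / n_points
    p_off_given_nd = (n_cells - class_cells - n_points + class_points) / (n_cells - n_points)

    w_plus  = math.log(p_on_given_d / p_on_given_nd)
    w_minus = math.log(p_off_given_d / p_off_given_nd)

    # Approximate variances of the weights: one over each cell count.
    var_plus  = 1.0 / class_points + 1.0 / (class_cells - class_points)
    var_minus = 1.0 / (n_points - class_points) + 1.0 / (n_cells - class_cells - n_points + class_points)

    contrast = w_plus - w_minus
    # Studentized contrast: contrast divided by its standard deviation.
    studentized = contrast / math.sqrt(var_plus + var_minus)
    return w_plus, w_minus, var_plus, var_minus, contrast, studentized

With hypothetical counts, a call such as weights(70, 10000, 45, 1500) would give a strongly positive W+, because nearly two-thirds of the points fall on only 15 percent of the area.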
The Evidential Theme Weights dialog for managing and performing weight calculations.

The Arc-WofE extension adds the Weights of Evidence menu to the View GUI.

Calculated weights tables can be inspected directly or charted using the Create Charts menu selection. After considering the standard deviations of the weights and the contrasts in the tables, and after making subjective judgments based on both the statistical results and real-world knowledge, the evidential themes can be generalized. The simple rule for reclassifying an ordered evidential theme into binary "pattern present" and "pattern absent" classes is to select the maximum contrast as a threshold. However, in many cases the contrast curve is not simple, owing to statistical noise and factors unrelated to physical processes. In these cases, consider more complex interpretations, allow multiple classes, or discard an evidential theme that is not acceptable.
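To make the maximum-contrast rule concrete, a short scan over cumulative buffer counts might look like the sketch below. It reuses the hypothetical weights() helper from the earlier sketch; the cumulative counts themselves would come from the GIS overlay.

def best_cutoff(n_points, n_cells, cumulative):
    """Return the buffer distance with the largest contrast.

    cumulative -- (distance, points_within, cells_within) tuples,
                  accumulated outward from the buffered feature
    """
    best = None
    for distance, points_within, cells_within in cumulative:
        *_, contrast, studentized = weights(n_points, n_cells, points_within, cells_within)
        if best is None or contrast > best[1]:
            best = (distance, contrast, studentized)
    return best

Carrying the studentized contrast along makes it easier to spot cut points whose apparent maximum is mostly statistical noise, the situation the article warns about.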
Generalizing the Evidential Theme

After considering the weights and deciding on the appropriate breaks to generalize the evidential theme, the Generalize Evidential Theme menu provides several tools that facilitate this process. The Define Groups tool uses a query process that takes advantage of the standard ArcView GIS query tool for generalizing a theme. The generalization created can be inspected by symbolizing the evidential theme with the new reclassification or its descriptor field. If the pattern is acceptable, then generalization is complete. If it is not acceptable, then alternative generalizations can be generated and stored. New themes are not produced; the process simply adds new fields to the theme's attribute table, as sketched below. This spatial data exploration and generalization step is repeated for each theme that will be used as evidence. This process completes the specification of the model.
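A minimal sketch of that bookkeeping, with hypothetical field names and a break already chosen from the weights statistics; the generalized class is written into a new column of the attribute table rather than into a new theme.

# Hypothetical attribute-table rows for a distance-buffer evidential theme.
rows = [
    {"value": 1, "distance_m": 250},
    {"value": 2, "distance_m": 500},
    {"value": 3, "distance_m": 1000},
    {"value": 4, "distance_m": 2000},
]

CUTOFF_M = 1000  # break selected from the maximum-contrast statistics

# Generalization adds a field to the existing table; no new theme is made.
for row in rows:
    row["gen_class"] = 1 if row["distance_m"] <= CUTOFF_M else 0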
Creating a Response Theme

After specifying the model, the user selects the evidential themes for prediction using the Calculate Response menu. For each evidential theme, the attribute field containing the desired generalization is selected. The complete weights-of-evidence calculations create a response theme, symbolized by posterior probability values (the default), that is added to the view. Although in previous operations the evidential themes could be either grids or features, the response theme is a grid. It contains the intersection of all of the input themes in a single integer theme. Each row of the attribute table contains a unique vector (tuple) of evidential-theme values, the number of training points, the area in unit cells, the sum of weights, the posterior logit, the posterior probability, and the measures of uncertainty. The variances of the weights and the variance due to missing data are summed to give the total variance of the posterior probability. The response theme may be symbolized by any of the fields in the attribute table. Various measures to test the conditional independence assumption are also reported. Results are stored as tables in .dbf format for later inspection.
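The core calculation for one row of that attribute table can be sketched as follows, assuming the summed weights and their variances are already in hand. The delta-method step for the standard deviation is one common approximation, not necessarily the exact bookkeeping Arc-WofE uses internally.

import math

def posterior(prior_prob, weight_sum, weight_var_sum, missing_var=0.0):
    """Posterior probability and uncertainty for one unique overlap condition.

    prior_prob     -- training point density (points per unit cell)
    weight_sum     -- sum of the applicable evidential-theme weights
    weight_var_sum -- sum of the variances of those weights
    missing_var    -- additional variance attributed to missing data
    """
    # Posterior logit = prior logit plus the applicable weights.
    prior_logit = math.log(prior_prob / (1.0 - prior_prob))
    post_logit = prior_logit + weight_sum
    post_prob = 1.0 / (1.0 + math.exp(-post_logit))

    # Total variance: weight variances plus missing-data variance,
    # as described in the article. Assumes a nonzero total variance.
    total_var = weight_var_sum + missing_var
    # Delta method: sd on the probability scale from sd on the logit scale.
    sd = post_prob * (1.0 - post_prob) * math.sqrt(total_var)
    confidence = post_prob / sd  # the informal "confidence" measure
    return post_prob, sd, confidence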
Once an acceptable model is created, it can be tested in a variety of ways. If pairwise conditional independence tests indicate that some theme pairs are strongly correlated, one theme of each such pair should be discarded or the two should be combined. If overall conditional independence is a significant problem, the posterior probabilities should be thought of as relative favorabilities rather than true probabilities.
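One overall check, the conditional independence ratio defined in the glossary below, compares the observed number of training points with the number predicted by summing area times posterior probability over the response theme. A minimal sketch:

def ci_ratio(n_points, conditions):
    """Observed training points divided by the number predicted.

    conditions -- (area_in_unit_cells, posterior_probability) pairs,
                  one per unique overlap condition in the response theme
    """
    predicted = sum(area * prob for area, prob in conditions)
    # Ratios well below 1 suggest the themes are not conditionally
    # independent, since double-counted evidence inflates predictions.
    return n_points / predicted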
The user can associate the calculated response values with both the training and test point subsets as another evaluation of the model. The user will want to select meaningful categories of posterior probabilities for symbolization of the response theme. The Arc-WofE extension provides several tools to help define meaningful symbolization that is easily interpreted.
This extension also provides an expert approach to weighting. This approach can be used when no training points are available or to modify the calculated weights using expert judgment after the data-driven model is created. The details of using the expert weighting are explained in the user's manual.
Concluding Remarks

The Arc-WofE extension has been used successfully to predict the locations of deposits of various mineral types. There has been a high degree of agreement between response maps and expert-opinion maps. The evolution of this extension continues with further enhancements to the weights-of-evidence method and the addition of logistic regression, fuzzy logic, and neural network modeling tools. A new enhanced extension, supported by a consortium of industry and government organizations, has been completed and will be in the public domain in January 2001.
About the Authors

Gary L. Raines is a research geologist for the U.S. Geological Survey in Reno, Nevada. His research focuses on the integration of geoscience information for predictive modeling in mineral resource and environmental applications.

Graeme F. Bonham-Carter is a research geologist working in the Mineral Deposits Division of the Geological Survey of Canada, Ottawa. He is interested in applications of GIS to mineral exploration and environmental problems. He is editor-in-chief of Computers & Geosciences, a journal devoted to all aspects of computing in the geosciences.
Laura Kemp is a GIS technologist based in Ottawa, Ontario, specializing in ArcView GIS application development.
The extension supplies various methods for generalizing evidential themes.

A response map showing five classes of posterior probability defined by natural breaks, shown with and without a confidence mask.
A glossary of terms relating to this article follows. For a list of references, including books and papers by the authors, visit ArcUser Online.
Coming to Terms

A glossary of terms relating to predictive probabilistic modeling as described in the previous article.

Model—The characteristics of a set of training points (e.g., mineral deposits) and the relationships of the training points to a collection of evidential themes (e.g., exploration data sets).

Study Area—A grid theme that acts as a mask to define the area where the model is developed and applied. It may be irregular in outline and may contain interior holes.

Training Points—The set of spatial point objects whose locations are to be predicted. In mineral exploration, these are the sites of known mineral deposits. Points are either present or absent; size or other attributes of these points are not modeled.

Evidential Theme—A theme whose attributes are associated with the occurrence of the training points.

Missing Data—Values of an evidential theme within the study area that are unknown. For example, areas containing younger rocks that conceal the rocks of interest may be treated as unknown in a geological theme.

Weight—A measure of an evidential-theme class (value) as a predictor of training points. A weight is calculated for each theme class. For binary themes, these are often labeled W+ and W-. For multiclass themes, each class can also be described by a W+ and W- pair, assuming presence/absence of this class versus all other classes. Positive weights indicate that more points occur on the class than would be expected by chance; the inverse holds for negative weights. The weight for missing data is zero. Each weight is approximately the natural logarithm of the ratio of the proportion of training points on a theme class to the proportion of the study area occupied by that class, approaching this value for an infinitely small unit cell.

Contrast—W+ minus W-, which is an overall measure of the spatial association of an evidential theme with the training points.

Unit Cell—A small unit of area used for counting. Each training point is assumed to occupy a unit cell, and all area measures are transformed to unit cells. The number of unit cells with training points is unaffected by changes in unit cell size, whereas area measurements are affected. The unit cell should be small, normally smaller than the minimum resolution of the evidential themes. Weight values are relatively insensitive to the unit cell size as long as the cell is small.

Prior Probability—The probability that a unit cell contains a training point before the evidential themes are considered. Normally it is assumed to be constant over the study area and equal to the training point density (total number of training points divided by total study area in unit cells).

Posterior Probability—The probability that a unit cell contains a training point after consideration of the evidential themes. This measurement changes from location to location depending on the values of the evidence, being larger than the prior where the sum of the weights is positive. The mathematical model is loglinear in form and uses a log-odds (logit) scale that is transformed to probability. The posterior logit is the sum of the prior logit and the applicable weights of the evidential themes.

Confidence—A measure based on the ratio of the posterior probability to its estimated standard deviation, used as an informal test of whether the posterior probability is greater than zero.

Conditional Independence (CI)—The Arc-WofE model assumes that the evidential themes are conditionally independent of one another with respect to the locations of the training points. If this assumption holds, the sum over the study area of area times predicted posterior probability should equal the number of training points. The CI ratio, the number of observed training points divided by the number of predicted points, gives one measure of violation of this assumption.