Transcript of "./publications/archive/Turner2002.doc"
SPIN!-project Working Paper
State-of-the-Art Geographical Data Mining
The views expressed in this paper are those of the author and do not necessarily
reflect those of the SPIN!-project consortium.
This SPIN!-project working paper was drafted in December 2002 to provide an
assessment of the state-of-the-art in geographical data mining (GDM). The paper
contends that the spatial data mining system for data of public interest developed
during the SPIN!-project (SPIN) is a prototype GDM system which is state-of-the-art
in terms of its architecture and functionality.
Two years ago a SPIN!-project report was compiled that assessed the state-of-the-art
in Exploratory Spatial Data Analysis (Turner, 2000). Since then the state-of-the-art
has not progressed far. Although most geographical software that has been developed
during this time has looked to take advantage of lower level improvements, and some
software features have been enhanced and additional functionality has been
incorporated, on the whole there have been no major breakthroughs. This opinion is
based on experience and a careful examination of relevant literature rather than a
large scale practical and critical evaluation. This is unfortunate, but a consequence of
the restricted availability of resources. Hopefully, the subjective assessment of the
state-of-the-art that this paper offers will be useful. It aims to provide a reference for
further research and highlights some developments that are likely to play an important
role in developing future state-of-the-art GDM systems.
Additionally, this working paper details a crucial difference in the meanings of
clustering terms as used in data mining and geography.
1. Background and Introduction
Available computational power has increased at an accelerating exponential rate
(URL 1). The increasing power comes from faster and more connected processors
and memory, larger and more organised memory banks, and more efficient operating
systems and software. As computational power increases and means for human
computer interaction evolves there arise new opportunities for the analysis of the vast
amounts of geographical data that are collected. The human and computer resources
available for software development are immense yet the available software for GDM
does not readily facilitate the processing of the massive volumes of geographical data
into more useful information that can be analysed in highly automated ways so as to
develop our understanding of geographical phenomenon.
Open source development and the Internet are coming of age. This is coinciding with
trends to modularise, develop well specified libraries of functionality and adopt and
develop standards. The data analysis capabilities of GIS, DMS, mathematical and
statistical packages, and other bespoke tools are continually being enhanced. Cross-
fertilisation is occurring whereby methods developed in one type of data analysis
software are being adapted and incorporated in others. More concise functional
toolkits for specialist applications can be more readily assembled. Methods are also
becoming more robust and packages more interoperable.
The spatial data mining system for data of public interest being built in the SPIN!-
project (SPIN) is an attempt to integrate geographical information system (GIS) and
data mining system (DMS) functionality in an open and extensible way. In essence
SPIN is a GDM system and any such system will have the following features:
1. a database - for storing, querying and retrieving data;
2. a graphical user interface (GUI) - for display and for interacting with and
developing functionality; and,
3. a suite of GDM analysis and GeoVisualisation methods.
Until recently most DMS were backed by databases which had special handlers for
temporal references, but had no special handlers for spatial references and so could
only readily treat them in the same way as general attribute data. Until recently most
GIS were backed by databases which had special handlers for spatial references, but
had no special handlers for temporal references and so could only readily treat them in
the same way as general attribute data. Databases are standardising whether they are
proprietary databases for backing GIS and DMS or not. Most database software being
developed are geared to handle complex data types (e.g. geographical data) that may
have both spatial and temporal references. Open source GDM systems developed in
the near future are likely to be based on postgreSQL and mySQL (URL 2, URL 3).
Display environments that facilitate interactive and dynamic linking of data plots,
graphs, tables, maps etc., and which enable animation are likely to be standard in the
next generation of GDM systems. Along with interactive display functionality, GUIs
are likely to have visual programming workspaces which can be used to develop
functions which in turn can be encapsulated and added to the widgets (buttons, menus
etc.) and made accessible to command line interfaces and for scripting.
Arguably a GeoVisualisation system can be based solely on the first two features as
described above. What is needed for a GDM system are the additional classification,
generalisation, cluster detection and inductive analysis tools geared for analysing
large volumes of geographical data.
This section ends by listing the aims of the SPIN!-project:
• To develop an integrated, interactive, internet-enabled spatial data mining
system for data of public interest.
• To improve knowledge discovery by providing an enhanced capability to
visualise data mining results in spatial, temporal and attribute dimensions.
• To develop new and integrated ways of revealing complex patterns in spatio-
temporally referenced data that were previously undiscovered using existing
• To enhance decision making capabilities by developing interactive GIS
techniques, which provide an integrated exploratory and statistical basis for
investigating spatial patterns.
• To deepen the understanding of spatio-temporal patterns by visual simulation.
• To publish and disseminate geographical data mining services over the
2. State-of-the-art ESDA at the beginning of the SPIN!-project circa 2000
This section looks back to the state-of-the-art in exploratory spatial data analysis
(ESDA) as it was described by Turner (2000).
At the beginning of the year 2000 most GIS offered:
• no inductive analysis methods;
• only very limited statistical functionality;
• only basic mathematical operations that would work on tables of data; and,
• no linked dynamic display environments.
With the exception of network analysis, most GIS were limited to fairly basic forms
of spatial analysis. However, it was noted that a new generation of GIS was emerging
which provided linked dynamic display environments (e.g. CDV (Dykes 1997, 2002);
Descartes (Andrienko and Andrienko, 1999); GeoVista Studio (Gahegan et al., 2000).
These systems are now commonly referred to as GeoVisualisation software although
some may reasonably claim to be GDM systems.
GIS were becoming more modular with extensions being developed that could be
plugged in to enhance a core of functionality. The modular architectures came hand
in hand with the establishment of larger communities of users working together by
sharing scripts and indeed playing a part in the development of extensions married to
their specialist requirements.
The OGC had begun delivering spatial interface specifications that were to encourage
interoperability between technologies that use geographical information. For
developers of open source GIS and the academic community in general, it lead to the
establishment of many open source projects and the provision of much freeware GIS.
Details on much of this can be obtained via the Free GIS web site (URL 4).
In 2000 there were numerous data mining systems (DMS) that had developed largely
unconnected to GIS research. These offered:
• most multivariate statistical methods including linear and logistic regression;
• various classification and modelling tools, such as decision trees, rule
induction methods, neural networks, memory or case based reasoning and k-
means and other so called clustering methods; and,
• sequence discovery for finding patterns in time series data.
It had been shown that many of these methods (neural networks in particular) could
be usefully applied to geographical data (Openshaw and Openshaw, 1997; Openshaw
and Abrahart, 2000). However, GIS contained little or none of this functionality.
Despite their powerful flexible and user friendly toolkits, even the most advanced
DMS had no mechanism for handling location or spatial aggregation or for coping
with spatial entities or spatial concepts such as a region or circle or distance or map
topology. Though the need to support so-called non-standard data types including
multi-media data, images, audio and so forth which contain special patterns had been
recognised, (Goebel and Gruenwald, 1999).
There was (and arguably still is) a considerable amount of hype about the capabilities
and functionality of DMS. Vendors argue that no credible evaluation of their
software can be performed by evaluators that have not undertaken (often expensive)
training courses. Furthermore the systems were (and many still are) tuned to work on
very specific hardware with specially configured operating systems underlying them.
Goebel and Gruenwald (1999) offered an intelligible survey of data mining and
knowledge discovery software tools and noted the need for seamless integration with
databases so that algorithms are scalable and do not rely on having all data in fast
access memory. It was observed that many systems relied on querying an underlying
database (whose data is stored in large but slow memory banks) and holding the
resulting data from the query in the fast access memory banks (RAM, real memory or
“database engine”) in order to run many of the data mining methods. This often
meant that the methods worked poorly with large volumes of data and were not really
as scaleable as data mining often demands.
In 2000 most ESDA tools were based on interactive graphics. The Geographical
Analysis Machine GAM/K was an exception and was already planned to be integrated
as a key component of SPIN. GAM/K is detailed by Openshaw (1995). More
recently, Openshaw (1998a) also detailed a Geographical Explanations Machine
(GEM) which like GAM/K is based on the concept of identifying clusters. It operates
similarly to GAM/K in that data from overlapping circular regions is analysed across
a range of different scales and the result is output as a map.
In addition to GAM/K and GEM there were a number of other cluster detection
methods. These have been detailed in other SPIN!-project reports and those with
merit are included in SPIN. Back in 2000 it was also argued that SPIN should contain
a geographically weighted regression method which could be used to examine the
geographical variation in regression model parameter estimates (Fotheringham et al.
2002; Brunsdon et al. 1996). A case was also made for including local indicators of
spatial association methods and a couple of other spatial statistical methods.
Shortly before the SPIN!-project began Openshaw (1999) outlined the key design
issues for developing GDM systems and highlighted the need to respect the special
features of geographical information:
• it is spatially referenced
• it is often temporally referenced
• observations are not independent
• data uncertainty and errors tend to be spatially structured
• spatial coverage is rarely global
• non-stationarity is to be expected
• relationships are often geographically localised
• non-linearity is the norm
• data distributions are usually non-normal
• there are often many variables but much redundancy
• there are often many missing values
• spatial, temporal and spatio-temporal clustering is important
• data can be aggregated and disaggregated in space and time and in space-time
Turner (2000) noted that the geocyberspace contained increasing amounts of 3D
spatial geographical information and whilst global coverage was rare some datasets
did exist and were increasingly being used for a wide range of environmental
applications. It was argued that functionality to analyse and visualise 3D spatial
information would be of great utility for the SPIN!-project especially for a seismic
application in order to examine the relation between earthquake occurrences, crustal
stresses and tidal patterns.
It is appreciated that there are many ways to represent spatial, temporal and spatio-
temporal data, and that the most appropriate representation depends on what
information is required and/or what phenomenon/process is under study. In many
cases topographic maps, time maps and cartograms provide more useful visualisations
than Euclidean maps (Orford et al., 1998). The ability to develop such
representations in SPIN was discussed but much of the functionality has not been
In addition to being able to work with 3D spatial data with temporal references and
alternatives to Euclidean representations, it is necessary to develop methods that:
• can handle no-data by attempting to minimise assumptions of what the value
of data would be;
• are scalable and can cope with very large datasets; and,
• are robust and precise so as to handle the potentially large ranges of numbers
and detail involved.
3. The state-of-the-art in GDM
There are a number of summaries regarding the challenges and achievements of GDM
research, (see for example Buttenfield et. al., 2001; Yuan et al., 2001; Openshaw and
Abrahart, 2000; Openshaw, 1999). These agree that the development of data mining
and knowledge discovery tools for geographical use must be fundamentally based on
using and coping with the special features of geographical information as listed in the
SPIN integrates GIS and spatial data mining functionality in an open and extensible
way using a client server architecture based on Enterprise JavaBeans, (May and
Savinov (2002). Enterprise JavaBeans (EJB) offers a specification and guidelines for
using Java Remote Method Invocation (RMI) and JDBC (URL ref). Java RMI
enables elements of Java programs to take some actions on other Java elements on
remote machines (URL ref). JDBC is the set of interfaces for connecting to using
database software (URL ref). The SPIN architecture was adopted at an early stage of
the SPIN!-project and has also been adopted by other similar systems (Bertolotto et
al., 2001; Takatsuka and Gahegan, 2002). Since the start of the SPIN!-project the
EJB specification has been revised and among other enhancements version 2.1 has
added support for web services. EJB offers a state-of-the-art architecture for
developing a GDM system.
SPIN offers state-of-the-art GeoVisualisation functionality derived from Descartes
and CommonGIS (Voss et al., 2002). GeoVista Studio developed mainly by a
research team based in the USA offers similar functionality (Gahegan et al., 2000;
Takatsuka, 2002; Takatsuka and Gahegan, 2002).
GeoVista Studio has a useful and convenient deployment mechanism via Java Web
start (Sun, 2002b) and can readily be customised and packaged up as an applet that
can be run in most web browsers. It is moving towards being open source and is
moving to base itself on GT2. GT2 is an open source, Java GIS toolkit for developing
standards compliant solutions. It aims to support Open GIS and other relevant
standards as they are developed.
4. Differences in clustering terminology
Research into developing and testing methods of detecting and measuring clustering
in space and over time is being undertaken in epidemiology, criminology, geography,
data mining and science in general. Unfortunately, there are different interpretations
of what a cluster is and confusion about the terms clustering, spatial cluster and
In the field of data mining, the term cluster is generally used to mean things which
share similar characteristics or attributes like classes, sets or groups; and clustering is
a term reserved for the process of classifying, sub-setting or grouping a set of things.
It differs from classification in that the number of clusters is not pre-specified. A
spatial cluster in data mining is a term which has been used for a collection of spatial
objects with similar locations in space irrespective of their other attributes or
characteristics. The term spatial clustering has thus been reserved for the process of
classifying or grouping spatial objects into spatial clusters without apriori specifying
how many spatial clusters there are (Murray, 1997; Estivill-Castro and Houle, 1999;
Han et. al., 2001).
Although the above offers a reasonable interpretation, it is not until comparatively
recently that researchers in the data mining field have begun to consider clustering
and spatial clustering in geographical data. This has lead to a clash in terminology,
thus it is important to note that:
• spatial clustering in data mining pays no attention to the attributes associated
with spatial location;
• geographic data is of a higher dimensionality and is not only a set of locations,
but comprises a set of measured attributes which may include temporal
The simplest way of defining a cluster (as used in epidemiology and much
geography) is as a localised excess incidence rate that is unusual in that there is more
of some variable than might be expected. For example, a cluster could be:
• a local excess disease rate; an unusual crime rate (hot spots);
• an unusual unemployment rate or road traffic accident rate (black spot);
• a region of unusually high positive residuals from a model;
• an unusual concentration of plant species or earthquake epicentres, etc.
Virtually any variable that has a spatial distribution will contain some degree of
pattern or concentration. Clusters are where and when these concentrations are
extreme. There are essentially two types of clusters. Clusters of excess, and clusters
of deficit. The former refers to unusually high concentrations of some rate and the
later refers to unusually low concentrations of some rate.
It is very important to appreciate the difference between the interpretations of these
terms. Likewise it is important that GDM methods deal with the special nature of
5. The future state-of-the-art in GDM
There is demand for GDM tools:
• for analysing geographical data all the time and as they are produced;
• that analyse patterns in the spatial locations and attributes of geographical
objects over time;
• that search for spatial correlation and autocorrelation
• that work across a range of spatial scales
The spatial data mining system for data of public interest being developed in the
SPIN!-project (SPIN) is an attempt to integrate state-of-the-art GIS and data mining
functionality in an open and extensible way. The result is effectively a prototype
GDM system which is arguably the most complete example of its kind. Its range of
functionality is very broad and it has many advanced capabilities, however much can
be done to improve it.
SPIN has not been subject to much user testing and has not been scientifically
evaluated. Is it really user-friendly? If there are patterns hidden in spatial data or
indeed relationships encrypted in spatial information then can SPIN help users to find
them by at least pointing them in the right direction?
SPIN is not really open because it is not open source. If the SPIN!-project comes to
an end then what will happen to it? Is SPIN only to be available to the SPIN!-project
consortium for research purposes? Will it not evolve into an open source project and
become openly distributed?
Currently some algorithms in SPIN are dependent on using Oracle database software.
Would it not be better if it could optionally use one of the open source databases as
In general there is a need for new analysis methods and more example applications
that show how GDM tools can be used by presenting novel results that are meaningful
in a clear and understandable way.
References and Bibliography
Andrienko G., Andrienko N. (1999) Interactive Maps for Visual Data Exploration.
In the International Journal of Geographical Information Science 13 (4) 355-374.
Adhikary J. (1996) Knowledge Discovery in Spatial Databases: Progress and
Challenges. Paper presented at an Association for Computing Machinery
Workshop on Research Issues on Data Mining and Knowledge Discovery,
Alexander F., Boyle P. (2000) Do cancers cluster? In Eliott P., Wakefield J., Best N.,
Briggs D. (eds.) Spatial Epidemiology: Methods and Applications. (Oxford
Alexander F., Boyle P. (1996) Methods for Investigating Localised Clustering of
Disease. International Agency for Research on Cancer scientific publication 135.
Andrienko N., Andrienko G. (2001) Intelligent Support for Geographic Data Analysis
and Decision Making in the Web. Geographical Information and Decision
Analysis 5 (2) 115-128.
Bertolotto M., McGeown L., Carswell J., McMahon J (2001) e-SpatialTM Technology
for Spatial Analysis and Decision Making in Web-Based Land Information
Management Systems. In Journal of Geographic Information and Decision
Analysis 5 (2) 95-114.
Bivand R. (2002) Implementing spatial data analysis software tools in R*. Paper
presented at a Center for Spatially Integrated Social Science Specialist Meeting on
Spatial Data Analysis Software Tools, USA.
Brunsdon C., Fotheringham S., Charlton M. (1996) Geographically Weighted
Regression: A method for exploring non-stationarity. In Geographical Analysis,
28 (4), 281-298.
Brunsdon C., MacGill J., Openshaw S., Turner A., Turton I., (1999) Testing space-
time and more complex hyperspace geographical analysis tools. Paper presented
at the 7th GISRUK conference, UK.
Buttenfield B., Gahegan M., Miller H., Yuan M. (2001) Geospatial Data Mining and
Knowledge Discovery. A UCGIS White Paper on Emergent Research Themes
submitted to UCGIS Research Committee.
Câmara G., Neves M., Monteiro A (2002) SPRING and TerraLib: Integrating Spatial
Analysis and GIS. Paper presented at Center for Spatially Integrated Social
Science Specialist Meeting on Spatial Data Analysis Software Tools, USA.
Carr D., Chen J., Bell S., Pickle L., Zhang Y., (2002) Interactive Linked Micromap
Plots and Dynamically Conditioned Choropleth Maps. Paper presented at the
Center for Spatially Integrated Social Science Specialist Meeting on Spatial Data
Analysis Software Tools, USA.
Diggle P. (2000) Overview of statistical methods for disease mapping and its
relationship to cluster detection. In Eliott P., Wakefield J., Best N., Briggs D.
(eds.) Spatial Epidemiology: Methods and Applications. (Oxford University
Dykes J. (2002) Developing Tools for GeoVisualization Research. Paper presented at
Center for Spatially Integrated Social Science Specialist Meeting on Spatial Data
Analysis Software Tools, USA.
Dykes J. (1997) Exploring spatial data representations with dynamic graphics. In
Computers & Geosciences 23 (4), 347-370.
Edsall R. (1999) Tools for the Exploration and Multivariate Classification of Large
Geographical Databases. Paper presented at the 95th Annual Meeting of the
Association of American Geographers, USA.
Edsall R., Roedler A. (2002) An Enhanced GIS Environment for Multivariate
Exploration: a Linked Parallel Coordinate Plot Applied to Urban Greenway Use
Survey Data. Paper presented at the Center for Spatially Integrated Social Science
Specialist Meeting on Spatial Data Analysis Software Tools, USA.
Estivill-Castro V. (2002) Why so many clustering algorithms – A position paper. In
SIGKDD explorations newsletter of the ACM Special Interest Group on
Knowledge Discovery in Data and Data Mining 4 (1) 65-75.
Estivill-Castro V., Houle M. (1999) Robust Distance-Based Clustering with
Applications to Spatial Data Mining. Algorithmica 30 (2) 216-242.
Estivill-Castro V., Lee I. (2001) Data Mining Techniques for Autonomous
Exploration of Large Volumes of Geo-referenced Crime Data. Paper presented at
the 6th International Conference on GeoComputation, Australia.
Fotheringham S., Brunsdon C, Charlton M. (2002) Geographically Weighted
Regression (Wiley, Chichester).
Fotheringham S., Brunsdon C, Charlton M. (2000) Quantitative Geography:
Perspectives on Spatial Analysis. (Sage, London).
Fotheringham S., Charlton M. (1994) GIS and exploratory data analysis: An overview
of some basic research issues. In Geographical Systems, 1 (4), 315-327.
Fotheringham S., Rogerson P. (1994) Spatial Analysis and GIS. (Taylor & Francis,
Fulcher C., Barnett Y., Barnett C. (2002) Spatial Analysis Software for Community
Decision Support. Paper presented at a Center for Spatially Integrated Social
Science Specialist Meeting on Spatial Data Analysis Software Tools, USA.
Gahegan M. (2001) Data mining and knowledge discovery in the geographical
domain. White Paper: National Academies Computer Science and
Telecommunications Board. (Intersection of Geospatial Information and IT
content and Knowledge distillation)
Gahegan M. (2000) On the application of inductive machine learning tools to
geographical analysis. In Geographical Analysis 32 (2), 113-139.
Gahegan M., Miller H., Yuan M. (2001) Geospatial Data Mining and Knowledge
Discovery. http://www.ucgis.org/emerging/gkd.pdf (Accessed in December 2002)
Gahegan M., Takatsuka M., Wheeler M., Hardisty F. (2000a) GeoVISTA Studio: a
geocomputational workbench. Paper presented at Geocomputation 2000, UK.
Gahegan M., Wachowicz M., Harrower M., Rhyne T-M. (2000b) The Integration of
Geographic Visualization with Knowledge Discovery in Databases and
Geocomputation. In Cartography and Geographical Information Systems (special
issue on the International Cartographic Association research agenda)
Goebel M., Gruenwald L. (1999) A survey of Data Mining and Knowledge Discovery
Software Tools. In SIGKDD Explorations 1 (1) 20-33.
Han J., Kamber M., Tung A. K. H. (2001) Spatial Clustering Methods in Data
Mining: A Survey. In H. Miller and J. Han (eds.) Geographic Data Mining and
Knowledge Discovery (Taylor & Francis, London).
Hewitson B., Crane R. (eds.), 1994, Neural Nets: Applications in Geography (Kluwer,
Indulska M., Orlowska E. (2002) Gravity Based Spatial Clustering. Paper presented
at the 10th Association for Computing Machinery International Symposium on
Advances in Geographical Information Systems, USA.
Krivoruchko K. (2002) Bridging the Gap Between GIS and Solid Spatial Statistics.
Paper presented at a Center for Spatially Integrated Social Science Specialist
Meeting on Spatial Data Analysis Software Tools, USA.
Koperski K., Adhikary J., Han J. (1996) Spatial Data Mining: Progress and
Challenges Survey paper. Paper presented at an Association for Computing
Machinery Workshop on Research Issues on Data Mining and Knowledge
Koperski K., Han J., Adhikary J. (1998) Mining Knowledge in Geographical Data.
ftp://ftp.fas.sfu.ca/pub/cs/han/pdf/geo_survey98.pdf (Accessed December 2002).
Lawson A. (1999) A Review of Cluster Detection Methods. In Lawson A., Biggeri
A., Böhning D., Lesaffre E., Viel J-F., Bertollini R. (eds.) Disease Mapping and
Risk Assessment for Public Health. (Wiley, Chichester)
Lazarevic A., Fiez T., Obradovic Z. (2000) A Software System for Spatial Data
Analysis and Modeling. Paper presented at the 33rd International Conference on
Systems Sciences, USA.
Levine N. (1998) Hot Spot Analysis Using both the SYSTAT K-Means Routine and a
Risk Assessment. Paper presented at the Academy of Criminal Justice Sciences
Annual Conference, USA.
MacEachren A., Hardisty F., Gahegan M., Wheeler M., Dai X., Guo D., Takatsuka M.
(2001) Supporting visual integration and analysis of geospatially-referenced data
through web-deployable, cross-platform tools. Paper presented at 20th
International Cartographic Conference, China.
Masters R., Edsall R. (2000) Interaction Tools to Support Knowledge Discovery: A
Case Study Using Data Explorer and Tcl/Tk. Paper presented at the Visualization
Development Environments workshop, New Jersey, USA.
May M., Savinov A. (2002) An integrated platform for spatial data mining and
interactive visual analysis. Paper presented at the 3rd International Conference on
Data Mining Methods and Databases for Engineering, Finance and Other Fields,
Miller H., Wentz E. (2002) Geographic Information Systems and Spatial Analysis:
Enhancing Analytical Capabilities by Expanding Geographic Representations.
http://www.geog.utah.edu/~hmiller/research.html (Accessed in December 2002).
Miller H., Han J. (2000) Discovering Geographic Knowledge in Data Rich
Environments: A Report on a Specialist Meeting. In SIGKDD Explorations 1 (2),
Openshaw S. (1999) Geographical data mining: key design issues. Paper presented at
the 4th International Conference on GeoComputation, USA.
Openshaw S. (1998) Building automated Geographical Analysis and Exploration
Machines. In Geocomputation: A primer, Longley P., Brooks S., Mcdonnell B.
(eds.) (Macmillan Wiley, Chichester) 95-115.
Openshaw S. (1995) Developing automated and smart spatial pattern exploration tools
for geographical information systems applications. The Statistician 44 (1), 3-16.
Openshaw S., Abrahart R. (eds.), 2000, GeoComputation (Taylor & Francis, London)
Openshaw S., Charlton M., Wymer C. and Craft A.W. (1987). A mark I geographical
analysis machine for the automated analysis of point data sets. International
Journal of Geographical Information Systems, 1, 335-358.
Openshaw S., Openshaw C. (1997) Artificial Intelligence in geography (Wiley,
(…details incomplete) Openshaw S., Turton I. (1999) Using a Geographical
Explanations Machine to Analyse Spatial Factors Relating to Primary School
Performance. In Geographical & Environmental Modelling.
Openshaw S., Turton I. (1998) Application of GAM to Crime Analysis Data. Paper
presented at the Academy of Criminal Justice Sciences Annual Conference, USA.
Openshaw S., Turton I., MacGill J. (1999) Using the Geographical Analysis Machine
to Analyse Limiting Long Term Illness. In Geographical & Environmental
Modelling 3 83-99.
Orford S., Dorling D., Harris R (1998) Review of Visualization in the Social
Sciences: A State of the Art Survey and Report. Report for the Advisory Group
on Computer Graphics.
Pacheco B. (2001) Assessing the Applicability and Usability of GeoVISTA Studio for
Health Geographics. Published in: The Pennsylvania State University Summer
Research Opportunities Program Journal 9 221-233.
Paddenburg A., Wachowicz M. (2001) The Effect of Spatial Generalisation On
Filtering Noise For Spatio-Temporal Analyses. Paper presented at the 6th
International Conference on GeoComputation, Australia.
Roddick J., Spiliopoulou M. (2002) A survey of Temporal Knowledge Discovery
Paradigms and Methods. IEEE Transactions on Knowledge and Data Engineering
14 (4) 750-767.
Roddick J., Hornsby K., Spiliopoulou M. (2001) Temporal, Spatial and Spatio-
Temporal Data Mining Research and Knowledge Discovery Research
Bibliography. http://kdm.first.flinders.edu.au/IDM/STDMBib.html (Accessed in
Roddick J., Lees B. (2001) Paradigms for Spatial and Spatio-Temporal Data Mining.
In Miller H., Han J. (eds.) Geographic Data Mining and Knowledge Discovery
(Taylor & Francis, London).
Sun (2002a) Enterprise JavaBeansTM Specification, Version 2.1.
http://java.sun.com/products/ejb/docs.html (Accessed in December 2002).
Sun (2002b) JavaTM Web Start http://java.sun.com/products/javawebstart/ (Accessed
in December 2002).
Shekhar S., Vatsavai R. (2002) Spatial Data Mining Research by the Spatial Database
Research Group, University of Minnesota. Paper presented at a National Science
Foundation workshop on Spatio-temporal Data Models for Biophysical Fields,
Shekhar S., Huang Y., Wu W., Lu C., Chawla S. (2001) What’s Spatial About Spatial
Data Mining: Three Case Studies. In Kumar V., Grossman R., Kamath C.,
Namburu R. (eds.) Data Mining for Scientific and Engineering Applications
Symanzik J., Swayne D., Lang D., Cook d. (2002) Software Integration for
Multivariate Exploratory Spatial Data Analysis. Paper presented at a Center for
Spatially Integrated Social Science Specialist Meeting on Spatial Data Analysis
Software Tools, USA.
Takatsuka M. (2002) An Open Component-Oriented Visual Programming
Environment for Integrating Geospatial Data Analysis and Visualization Tools.
Paper presented at the Center for Spatially Integrated Social Science Specialist
Meeting on Spatial Data Analysis Software Tools, USA.
Takatsuka M., Gahegan M. (2002) GeoVista Studio: A Codeless Visual Programming
Environment for Geoscientific Data Analysis and Visualization. To appear in
Computers & Geosciences 28 (10) 1131-1144.
Tango T. (1999) Comparison of General Tests for Spatial Clustering. In Lawson A.,
Biggeri A., Böhning D., Lesaffre E., Viel J-F., Bertollini R. (eds.) Disease
Mapping and Risk Assessment for Public Health. (Wiley, Chichester)
Tung A., Hou J., Han J. (2001) Spatial Clustering in the Presence of Obstacles. Paper
presented at the 17th International Conference on Data Engineering, Germany.
Turner A. (2000) State-of-the-art Exploratory Spatial Data Analysis. SPIN!-project
Turner A., Turton I., Walder A. (2000) Evaluation of Applied Statistical Cluster
Detection Methods. SPIN!-project report Deliverable 5.3.
Turton I., Walder A. (2002) Algorithms for Investigation of Temporal Change.
SPIN!-project report Deliverable 6.4.
Turton I., Walder A. (2000) Why Spatial Pattern Detection is Harder than it Looks.
Paper presented at the 5th International conference on GeoComputation.
Voss H., Andrienko N., Andrienko G., Gatalsky P. (2001) Web-based Spatio-
temporal Presentation and Analysis of Thematic Maps. In the Cities and Regions
Journal of the Standing Committee on Regional and Urban Statistics and
Voss H., Andrienko N., Andrienko G. (2002) Exploratory Data Analysis and Decision
Making with Descartes and CommonGIS. Paper presented at a Center for
Spatially Integrated Social Science Specialist Meeting on Spatial Data Analysis
Software Tools, USA.
Wakefield J., Kelsall J., Morris S (2000) Clustering, cluster detection and spatial
variation in risk. In Eliott P., Wakefield J., Best N., Briggs D. (eds.) Spatial
Epidemiology: Methods and Applications. (Oxford University Press).
Wu Y-H., Miller H. (Forthcoming) Computational Tools for Measuring Space-Time
Accessibility within Transportation Networks with Dynamic Flow. In Journal of
Transportation and Statistics special issue on accessibility 4 (2/3) 1-14.
The following literature was not acquired and read although it was wanted:
Andrienko N., Andrienko G., Savinov A., Voss H., Wettschereck D. (2001)
Exploratory Analysis of Spatial Data Using Interactive Maps and Data Mining. In
Cartography and Geographic Information Science 28 (3) 151-165.
Edsall R. (1999) Development of Interactive Tools for the exploration of Large
Geographic Databases. Paper presented at the 19th International Cartographic
MacEachren A., Wachowicz M., Edsall R., Haug D., Masters R. (1999) Constructing
knowledge from multivariate spatiotemporal data: Integrating Geographic
Visualization with Knowledge Discovery in Database Methods. In a special issue
of the International Journal of Geographical Information Science.
1. Computer history
2. Free GIS
3. Geographically Weighted Regression (GWR)
5. Open GIS Consortium
6. Oracle spatial