FUZZY STRUCTURAL PRIMITIVES FOR SPATIAL DATA MINING
A. Boulmakoul*, K. Zeitouni**, N. Chelghoum**
*LIST - Laboratoire Informatique des Systèmes de Transport,
Faculté des Sciences et Techniques de Mohammedia,
B.P. 146 Mohammedia, Maroc. {boul@uh2m.ac.ma}
**Laboratoire PRISM - Université de Versailles Saint Quentin,
45, Avenue des Etats-Unis, F-78035 Versailles Cedex, France.
Abstract: Spatial data mining is attracting growing interest. Its fundamental processes are, in particular, clustering and the detection of structural patterns. These processes are strongly influenced by the concept of proximity, or neighborhood. This paper introduces structures for building a spatial data mining system that integrates fuzzy structural primitives, and proposes its application within a system for road safety analysis. We also propose a general fuzzy algorithm that determines partitions for a reflexive and symmetric fuzzy relation. These investigations are important for data analysis and spatial data mining. The system implementation uses, in particular, C++/STL, the Microsoft Foundation Class Library (MFC) and the MapObjects ActiveX control (ESRI). The architecture of the system components is also described in this work.

Index terms: GIS, spatial data mining, fuzzy clustering, similarity, accidents analysis, MapObjects.

1. INTRODUCTION

The main objective of spatial data mining is to discover hidden, complex knowledge from spatial and non-spatial data, despite their huge volume and the complexity of computing spatial relationships. However, spatial data mining methods are still an extension of those used in conventional data mining. Spatial Data Mining (SDM) consists of two functions [3] [17-18]. The first describes a spatial phenomenon by exploring data, for example identifying risky zones by viewing the spatial distribution of accident locations. The second explains or even predicts the phenomenon by looking for correspondences with properties of the geographical environment. For instance, accidents could be "explained" by the state of the road or by the surrounding urban density. Spatial classification belongs to these explanatory methods.

Algorithms of spatial data mining are strongly bound to the concept of neighborhood relations. Neighborhood relations, as defined in relatively recent work on knowledge discovery in spatial databases, were integrated into database management systems (DBMS) [7]. This approach permits the use of efficient data mining algorithms. Based on neighborhood relations, some operators in the DBMS context have been proposed to facilitate the expression of spatial data mining algorithms [8-9]. The objective of this paper is to incorporate some supplementary techniques based on the fuzzy structural analysis of complex systems [5] [12] [14].

In this article we propose a simple general algorithm that generates partitions for all reflexive and symmetric fuzzy relations, in particular for those that are max-min transitive or max-$\Delta$ transitive. This algorithm has two advantages over the one proposed by Yang [15]: its general formulation and the simplicity of its implementation.

A software component dedicated to the management and manipulation of fuzzy relations has been developed and built into a global spatial data mining system.

Spatial clustering is the process of grouping objects into classes. Several techniques have been developed; they fall into the following typology: partitioning methods, hierarchical methods, density-based methods [13] and grid-based methods [10] [18]. In this context we propose a fuzzy clustering model based on fuzzy graphs. These graphs are constructed from fuzzy relations between objects, using spatial relations [4]. In this approach, properties and operations of fuzzy similarity relations are called upon (hierarchical analysis and the convex decomposition of fuzzy relations).

The work described in this paper targets spatial data mining for road accident analysis. Traffic risk analysis makes it possible to identify road safety problems in order to propose safety measures. This project aims at deducing relevant risk models to help in the traffic safety task. The risk assessment is based on information about previous injury accidents collected by police forces. However, right now, this analysis has
[Figure 4: class diagram. Recoverable elements: classes Data Provider, Map, FuzzyClusterAlgo and Accident; attribute Omega : Accident; operations AddAccidentsLayer(), OnMapSaveShapesFromDB(), OnMapSaveShapesFromFile(); multiplicity 1..*.]

Figure 4. Fuzzy clustering class diagram.

The software components deduced at this stage are organized into packages that correspond to the detected domains (Figure 5).

[Figure 5: component view. Recoverable elements: IMS, Data mining, Spatial, STL (generic package), WebLink (ActiveX), MapObject (ActiveX), Fuzzy Primitives (generic package), Accident Data Provider (generic package), DBMS.]

Figure 5. SDM components view.

3. FUZZY RELATIONS AND CLUSTERING

A fuzzy similarity relation [1-2] [6] [8-9] is a generalization of the notion of equivalence relation in the classical setting.

Let $\Omega$ be a set of objects and $R$ a fuzzy relation on $\Omega$. $R$ is a similarity relation if it satisfies the following properties [16]:
- (reflexivity) $\mu_R(x, x) = 1 \quad \forall x \in \Omega$;
- (symmetry) $\mu_R(x, y) = \mu_R(y, x) \quad \forall (x, y) \in \Omega^2$;
- (max-T transitivity) $\mu_R(x, z) \geq \max_{y \in \Omega} T(\mu_R(x, y), \mu_R(y, z)) \quad \forall (x, z) \in \Omega^2$; where $T$ is a T-norm.

Depending on the choice of $T$:
- if $T(\mu_R(x, y), \mu_R(y, z)) = \min(\mu_R(x, y), \mu_R(y, z))$, then $R$ is said to be max-min transitive,
- if $T(\mu_R(x, y), \mu_R(y, z)) = \max(0, \mu_R(x, y) + \mu_R(y, z) - 1)$, then $R$ is said to be max-$\Delta$ transitive,
- if $T(\mu_R(x, y), \mu_R(y, z)) = \mu_R(x, y) \cdot \mu_R(y, z)$, then $R$ is said to be max-prod transitive.

If $R$ is a fuzzy relation, its convex decomposition is given by $R = \max_{\alpha} \alpha R_\alpha$, where $R_\alpha$ is the $\alpha$-cut of the relation $R$, $\alpha \in [0, 1]$. If $R$ is a max-min transitive similarity relation, then $R_\alpha$ is an equivalence relation. The algorithm proposed below is used to find partitions of similarity relations constructed from road accident data. The construction of these similarity relations relies on the spatial data.

3.1. General algorithm for partition finding

Notation:
- $\Omega$: the set of objects to classify.
- $R$, $\mu_R$: a reflexive and symmetric fuzzy relation defined on $\Omega$, and its indicator function.
- Clusters: the list of obtained clusters, initially empty.
- $\sigma(e, C)$: the similarity function between an object $e$ and a set $C$, computed from the values $\mu_R(e, x)$, $x \in C$.
- $\hat{R} = \min(R, R_\alpha)$, where $R_\alpha$ is the $\alpha$-cut of the relation $R$, $\alpha \in [0, 1]$.

Several symbols of this subsection are illegible in the source. The descriptive names Clusters, key and Candidates stand in for them below; the aggregation operator used in $\sigma$ and the definition of Candidates(e), which involves $\mu_R(e, x)$ and a comparison with 0, could not be recovered.

The main procedures of the algorithm are given below. T is an STL stack container in which the elements of $\Omega$ to classify are sorted by decreasing values of key.

Cluster_finding(R, $\Omega$, $\alpha$, Clusters)
{ $\hat{R} = \min(R, R_\alpha)$;
  $\forall e \in \Omega$, key(e) $= \max_{x \in \Omega} \mu_{\hat{R}}(e, x)$;
  T = heap_sort($\Omega$, compare(key));
  while (!T.empty())
  { e = T.top();
    if (Clusters.empty())
      CreateCluster(e, T, Clusters);
    else
    { $\forall C \in$ Clusters, calculate $\sigma(e, C)$ and Candidates(e);
      if (!Candidates(e).empty())
        AttractionCluster(e, Candidates(e));
      else
        CreateCluster(e, T, Clusters);
    }
  }
}
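As an illustration of the definitions of this section, the following C++/STL sketch checks max-T transitivity for the three T-norms and extracts the classes of an $\alpha$-cut. It is a minimal sketch and not the software component described in this paper: the names Relation, TNorm, t_min, t_delta, t_prod, is_square, is_max_t_transitive and alpha_partition are hypothetical, the relation is assumed to be stored as a dense square matrix of membership degrees, and alpha_partition assumes that the $\alpha$-cut is an equivalence relation (which holds for a max-min transitive similarity relation).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical representation: r[x][y] holds mu_R(x, y) in [0, 1].
using Relation = std::vector<std::vector<double>>;
using TNorm = std::function<double(double, double)>;

// The three T-norms named in the text (min, Delta, product).
double t_min(double a, double b)   { return std::min(a, b); }
double t_delta(double a, double b) { return std::max(0.0, a + b - 1.0); }
double t_prod(double a, double b)  { return a * b; }

// True if every row has as many entries as there are rows.
bool is_square(const Relation& r) {
    return std::all_of(r.begin(), r.end(),
        [&r](const std::vector<double>& row) { return row.size() == r.size(); });
}

// Checks mu_R(x,z) >= max over y of T(mu_R(x,y), mu_R(y,z)) for all (x,z).
// eps absorbs floating-point rounding in the T-norm evaluation.
bool is_max_t_transitive(const Relation& r, const TNorm& t, double eps = 1e-9) {
    assert(is_square(r));
    const std::size_t n = r.size();
    for (std::size_t x = 0; x < n; ++x)
        for (std::size_t z = 0; z < n; ++z)
            for (std::size_t y = 0; y < n; ++y)
                if (r[x][z] + eps < t(r[x][y], r[y][z])) return false;
    return true;
}

// Labels the classes of the alpha-cut R_alpha = {(x,y) : mu_R(x,y) >= alpha}.
// Assumes R_alpha is an equivalence relation; classes are numbered 0, 1, ...
// in the order of their first member.
std::vector<int> alpha_partition(const Relation& r, double alpha) {
    assert(is_square(r));
    const std::size_t n = r.size();
    std::vector<int> label(n, -1);
    int next = 0;
    for (std::size_t x = 0; x < n; ++x) {
        if (label[x] != -1) continue;  // x already belongs to a class
        label[x] = next;
        for (std::size_t y = x + 1; y < n; ++y)
            if (label[y] == -1 && r[x][y] >= alpha) label[y] = next;
        ++next;
    }
    return label;
}
```

For example, the relation with rows (1, 0.8, 0.4, 0.4), (0.8, 1, 0.4, 0.4), (0.4, 0.4, 1, 0.6) and (0.4, 0.4, 0.6, 1) is max-min transitive, and its cut at $\alpha = 0.5$ yields the two classes {0, 1} and {2, 3}.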
The procedure CreateCluster creates a new cluster and removes all classified objects from the stack T.

CreateCluster(e, T, Clusters)
{ C = new cluster;
  $\Gamma(e) = \{ y \in T,\ y \neq e \ /\ \mu_R(y, e) = \max_{x \in \Omega} \mu_R(e, x) \}$;
  y* = $\Gamma(e)$.top(); C.insert(e); C.insert(y*);
  Clusters.insert(C); T.delete(e); T.delete(y*);
}

The procedure AttractionCluster, given below, assigns an object to the "most similar" existing cluster. It removes all classified objects from the stack T.

AttractionCluster(e, Candidates(e))
{ Let C* be the cluster such that $\sigma(e, C^*) = \max_{C \in \mathrm{Candidates}(e)} \sigma(e, C)$;
  C* = C* $\cup$ {e}; T.delete(e);
}

The general algorithm ensures the extraction of partitions for any similarity relation that is max-min transitive or max-$\Delta$ transitive.

4. CONCLUSION & PERSPECTIVES

In this work we proposed a general partition-finding algorithm for a reflexive and symmetric fuzzy relation. The algorithm determines partitions for all cases of fuzzy relation transitivity (max-min, max-$\Delta$, etc.). In any case, the algorithm can be applied directly to a reflexive, symmetric and non-transitive fuzzy relation. The design of a spatial data mining system is also described in this work. We are going to integrate into this system the software component for manipulating fuzzy relations, for the spatial data mining of accidents.

The fuzzy structural primitives will bring a new approach to accident analysis in the setting of this work. On the basis of fuzzy graphs and the clustering algorithm, it is possible to reach the following objectives:
1. to define a formal model of the urban transportation network founded on the notion of fuzzy neighborhood graphs, by defining a neighborhood graph through the concept of fuzzy relation,
2. to build a risk map for the urban network using fuzzy paths [11],
3. to conduct itinerary risk analyses to detect dangerous paths,
4. to project the results obtained onto an exportable dynamic map on the Web,
5. to integrate and formulate information about "accident risk" according to a specified point of view.

REFERENCES
1. Backer E., Cluster analysis by optimal decomposition of induced fuzzy sets, PhD Thesis, Delftse University, 1978.
2. Boulmakoul A., Structure Prétopologique Matroïdale : Application à la Décomposition des Systèmes Complexes, Conférence Internationale de Mathématiques Appliquées et Sciences de l'Ingénieur, Tome I, pp. 277-281, Casablanca, ENSEM, 14-19 Nov. 1996.
3. Boulmakoul A., Zeitouni K., Primitives structurales pour le data mining spatial, in Int. AMSE Conf., vol. 1, pp. 62-69, March 19-21, 2001, Rabat, Morocco.
4. Cohn A.G., Randel D.A., Cui Z., Taxonomies of logically defined qualitative spatial relations, Int. Journal of Human-Computer Studies, 43 (1995), 831-846.
5. Dussauchoy A., Paths algebra, similarities and system decomposition, Journal of Math. Analysis and Applications, Vol. 102, N° 1, 75-85.
6. Emptoz H., Modèle prétopologique pour la reconnaissance des formes, Thèse d'état, 1983, Université Claude Bernard Lyon, France.
7. Ester M., Kriegel H-P., Sander J., Spatial data mining: a database approach, Proc. in Lecture Notes in Computer Science, Vol. 1262, Springer, 1997, pp. 47-66.
8. Kim L., Fuzzy relation compositions and pattern recognition, Information Sciences 89 (1996) 107-130, Elsevier.
9. Murali V., Fuzzy equivalence relations, Fuzzy Sets and Systems 30 (1989) 155-163.
10. Nanopoulos A., Manolopoulos Y., Mining patterns from graph traversals, Data & Knowledge Engineering 37 (2001) 243-266, Elsevier.
11. Okada S., Soper T., A shortest path problem on a network with fuzzy arc lengths, Fuzzy Sets and Systems 109 (2000) 129-140.
12. Tamura S., Higuchi S., Tanaka K., Pattern classification based on fuzzy relations, IEEE Trans. Syst. Man Cybernet. 1 (1978) 61-66.
13. Tremolières R., The percolation method for an efficient grouping of data, Pattern Recognition, vol. 11, n°4, 1979.
14. Yager A., On general class of fuzzy connectives, Fuzzy Sets and Systems 4 (1980) 235-242.
15. Yang M., Shih H., Cluster analysis based on fuzzy relations, Fuzzy Sets and Systems 120 (2001) 197-212.
16. Zadeh L., Similarity relations and fuzzy ordering, Infor. Sci. 3 (1971) 177-200.
17. Zeitouni K., Chelghoum N., Boulmakoul A., Arbre de décision spatial multi-thèmes, in SFC'01, 17-21 Décembre 2001, Pointe-à-Pitre, Guadeloupe.
18. Zeitouni K., Yeh L., Le data mining spatial et les bases de données spatiales, in Revue Int. de géomatique, Vol. 9, n°4/1999, pp. 389-423.