FUZZY STRUCTURAL PRIMITIVES FOR SPATIAL DATA MINING

A. Boulmakoul*, K. Zeitouni**, N. Chelghoum**


*LIST - Laboratoire Informatique des Systèmes de Transport, Faculté des Sciences et Techniques de Mohammedia, B.P. 146 Mohammedia, Maroc. {boul@uh2m.ac.ma}
**Laboratoire PRISM - Université de Versailles Saint Quentin, 45, Avenue des Etats-Unis, F-78035 Versailles Cedex, France.

Abstract

Interest in spatial data mining is growing steadily. Its fundamental processes are, in particular, clustering and structural pattern detection, both of which are strongly influenced by the concept of proximity or neighborhood. This paper introduces structures for building a spatial data mining system that integrates fuzzy structural primitives, and proposes its application within a system for road safety analysis. We also propose a general fuzzy algorithm that determines partitions for a reflexive and symmetric fuzzy relation. These investigations are important for data analysis and spatial data mining. The system implementation uses, in particular, C++/STL, the Microsoft Foundation Class Library (MFC) and the MapObjects ActiveX control (ESRI). The architecture of the system components is also described in this work.

Index terms: GIS, spatial data mining, fuzzy clustering, similarity, accidents analysis, MapObjects.

1. INTRODUCTION

The main objective of spatial data mining is to discover hidden complex knowledge from spatial and non-spatial data, despite their huge volume and the complexity of computing spatial relationships. However, spatial data mining methods are still an extension of those used in conventional data mining. Spatial Data Mining (SDM) consists of two functions [3] [17-18]. The first describes a spatial phenomenon by exploring data, for example identifying risky zones by viewing the spatial distribution of accident locations. The second explains or even predicts the phenomenon by looking for correspondences with properties of the geographical environment. For instance, accidents could be "explained" by the state of the road or by the surrounding urban density. Spatial classification belongs to these explanatory methods.

Spatial data mining algorithms are strongly bound to the concept of neighborhood relations. The neighborhood relation, as defined in relatively recent work on knowledge discovery in spatial databases, has been integrated into database management systems (DBMS) [7]. This approach allows efficient data mining algorithms to be used. Based on neighborhood relations, some operators in the DBMS context have been proposed to facilitate the expression of spatial data mining algorithms [8-9]. The objective of this paper is to incorporate some supplementary techniques based on the fuzzy structural analysis of complex systems [5] [12] [14].

In this article we propose a simple general algorithm that generates partitions for all reflexive and symmetric fuzzy relations, in particular for those that are max-min transitive or max-$\Delta$ transitive. Compared with the algorithm proposed by Yang [15], it has the advantage of a general formulation and a simple implementation.

A software component dedicated to the management and manipulation of fuzzy relations has been developed and built into a global spatial data mining system.

Spatial clustering is the process of grouping objects into classes. Several techniques have been developed; they fall into the following typology: partitioning methods, hierarchical methods, density-based methods [13] and grid-based methods [10] [18]. In this context we propose a fuzzy clustering model based on fuzzy graphs. These graphs are constructed from fuzzy relations between objects, using spatial relations [4]. This approach draws on the properties and operations of fuzzy similarity relations (hierarchical analysis and the convex decomposition of a fuzzy relation).

The work described in this paper targets spatial data mining for road accident analysis. Traffic risk analysis identifies road safety problems so that safety measures can be proposed. This project aims at deducing relevant risk models to support the traffic safety task. Risk assessment is based on information about previous injury accidents collected by police forces. However, right now, this analysis has
been based on statistics with no consideration of spatial relationships. This work aims at identifying risky road sections, and at analyzing and explaining those risks with respect to the geographic context. We propose to combine accident data with thematic data relating to the road network, the population census, the buildings, or any other geographic neighborhood information in the process of risk analysis. This is specifically the approach of spatial data mining technology. This paper focuses on the clustering task. The analysis builds a spatial partition that integrates the spatial features of the input thematic layer (here, the accidents). This allows the interaction with other thematic layers to be considered in decision rule induction. In the application domain, one can then explain and predict the danger of roads from their geographical context.

The article is structured as follows. Section 2 gives the design aspects of the spatial data mining system and shows the integration of the software component dedicated to fuzzy relations. Section 3 briefly recalls some operations on similarity relations and develops the general algorithm for extracting the partitions of a similarity relation. The last section concludes this work, summarizes the results of the present developments and outlines the future stages of this research project.

Figure 1. Spatial accidents analysis in Mohammedia city.

2. SPATIAL DATA MINING SOFTWARE COMPONENTS ANALYSIS & DESIGN

This section studies elements of the analysis and design of a spatial data mining system. The objective of the intended system is the analysis of spatial data on accidents that occurred on urban roads. The system includes several data exploration primitives. The manipulation of fuzzy relations in a general setting is also integrated in this system.

The analysis of the spatial data mining system yields three important informational domains (figure 3):

1- a domain bound to the technical components of the SDM (data mining algorithms, query operators, etc.),
2- a domain concerning the semantic persistent data (DBMS),
3- a domain attached to the spatial persistent data (GIS).

The spatial view uses MapObjects (© ESRI) together with the STL and MFC libraries. MapObjects is a set of mapping software components that provides dynamic mapping and geographic information system capabilities to Windows applications, or that can be used to build custom mapping and GIS solutions.

MapObjects comprises an ActiveX control called the Map control and a set of 46 ActiveX automation objects. In our work, MapObjects is used in the Visual C++ programming environment. Figure 1 shows the interface of the Spatial Data Mining application for accident analysis. The application is developed under Windows (MFC/MapObjects) and allows the user to express requests for spatial accident analysis on the Mohammedia urban transport network.

Figure 2 shows the collaborations between the different classes. In this use case view, only the fundamental interactions contributing to the processing of the fuzzy clustering algorithm, the data exchange and their visualization are taken into account. Figure 4 gives a simplified class diagram for this use case view.

Figure 2. Fuzzy clustering use case view. (Diagram elements: Fuzzy Cluster (business actor); CView; Clustering; Accident Clusters exchange; Data preparation; Display; DataProvider (actor); Create Accident Layer; CMap; Accident DB (actor).)

Figure 3. Main domains. (Diagram labels: GIS; Spatial Data Mining; Clustering; DBMS; Geo Statistics; Structural primitives.)
Figure 4. Fuzzy clustering Class diagram. (Diagram classes: DataProvider; Omega : Accident; Map, with operations AddAccidentsLayer(), OnMapSaveShapesFromDB(), OnMapSaveShapesFromFile(); FuzzyClusterAlgo; Accident, with multiplicity 1..*.)

The software components identified at this stage are organized into packages that correspond to the detected domains (figure 5).

Figure 5. SDM Components view. (Diagram packages: IMS; Data mining; Spatial; STL (generic package); WebLink (ActiveX); MapObject (ActiveX); Fuzzy Primitives (generic package); Accident Data Provider (generic package); DBMS.)

3. FUZZY RELATIONS AND CLUSTERING

A fuzzy similarity relation [1-2] [6] [8-9] is a generalization of the notion of equivalence relation in the classic setting.

Let $\Omega$ be a set of objects and $R$ a fuzzy relation on $\Omega$. $R$ is a similarity relation if it satisfies the following properties [16]:

- (reflexivity) $\mu_R(x,x) = 1 \quad \forall x \in \Omega$;
- (symmetry) $\mu_R(x,y) = \mu_R(y,x) \quad \forall (x,y) \in \Omega^2$;
- (max-T transitivity) $\mu_R(x,z) \ge \max_{y} T(\mu_R(x,y), \mu_R(y,z)) \quad \forall (x,z) \in \Omega^2$, where $T$ is a T-norm.

Depending on the T-norm:

- if $T(\mu_R(x,y), \mu_R(y,z)) = \min(\mu_R(x,y), \mu_R(y,z))$, then $R$ is said to be max-min transitive;
- if $T(\mu_R(x,y), \mu_R(y,z)) = \max(0, \mu_R(x,y) + \mu_R(y,z) - 1)$, then $R$ is said to be max-$\Delta$ transitive;
- if $T(\mu_R(x,y), \mu_R(y,z)) = \mu_R(x,y) \cdot \mu_R(y,z)$, then $R$ is said to be max-prod transitive.

If $R$ is a fuzzy relation, its convex decomposition is given by $R = \max_{\alpha} \alpha R_\alpha$, where $R_\alpha$ is the $\alpha$-cut of the relation $R$, $\alpha \in [0,1]$. If $R$ is a max-min transitive similarity relation, then $R_\alpha$ is an equivalence relation.

The algorithm proposed below is used to find partitions of similarity relations constructed from road accident data. The construction of the similarity relations refers to the spatial data.

3.1. General algorithm for partition finding

Notation:

- $\Omega$: the set of objects to classify.
- $R$, $\mu_R$: a reflexive and symmetric fuzzy relation defined on $\Omega$, and its membership function.
- $\Pi$: the list of obtained clusters, initially empty.
- $\sigma(e, \cdot)$: the similarity function between an object and a set, computed from the values $\mu_R(e, x)$.
- $\hat{R} = \min(R, R_\alpha)$, where $R_\alpha$ is the $\alpha$-cut of the relation $R$, $\alpha \in [0,1]$.

Let the following functions of $e$ be defined from the values $\mu_R(e, x)$: [...]

The main procedures of the algorithm are given below. T is an STL stack container in which the elements of $\Omega$ to classify are sorted by decreasing values of $\rho$.

Cluster_finding(R, Ω, Π, α)
{
  R̂ ← min(R, R_α);
  ρ(e) = max_{x ∈ Ω} µ_R̂(e, x);
  T ← heap_sort(Ω, compare(ρ));
  while (!T.empty())
  {
    e ← T.top();
    if (Π.empty())
      then CreateCluster(e, T, Π);
    else
    {
      ∀ π ∈ Π, calculate σ(e, π), [...], Γ(e);
      if (!Γ(e).empty())
        then AttractionCluster(e, Π);
        else CreateCluster(e, T, Π);
    }
  }
}
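The definitions recalled in Section 3 (the three T-norms, max-T transitivity and the α-cut) can be made concrete with a short C++ sketch. This is only an illustration under stated assumptions, not the authors' fuzzy relations component: the dense matrix type `FuzzyRelation` and the functions `isMaxTTransitive`, `alphaCut` and `partitionFromCut` are hypothetical names introduced here. The last function relies on the property stated above that the α-cut of a max-min transitive similarity relation is an equivalence relation, so each row of the cut directly gives an equivalence class.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed representation (not from the paper): a fuzzy relation on n objects
// stored as a dense n x n matrix of membership degrees mu_R(x, y) in [0, 1].
using FuzzyRelation = std::vector<std::vector<double>>;
using TNorm = double (*)(double, double);

// The three T-norms named in Section 3.
double tMin(double a, double b) { return std::min(a, b); }
double tLukasiewicz(double a, double b) { return std::max(0.0, a + b - 1.0); }
double tProduct(double a, double b) { return a * b; }

// Checks mu_R(x, z) >= T(mu_R(x, y), mu_R(y, z)) for every triple (x, y, z),
// which is equivalent to the max-T transitivity condition.
// A small tolerance absorbs floating-point rounding in the T-norm.
bool isMaxTTransitive(const FuzzyRelation& r, TNorm t) {
    const double eps = 1e-12;
    const std::size_t n = r.size();
    for (std::size_t x = 0; x < n; ++x)
        for (std::size_t y = 0; y < n; ++y)
            for (std::size_t z = 0; z < n; ++z)
                if (r[x][z] + eps < t(r[x][y], r[y][z])) return false;
    return true;
}

// alpha-cut R_alpha: the crisp relation that holds where mu_R(x, y) >= alpha.
std::vector<std::vector<bool>> alphaCut(const FuzzyRelation& r, double alpha) {
    assert(alpha >= 0.0 && alpha <= 1.0);
    const std::size_t n = r.size();
    std::vector<std::vector<bool>> cut(n, std::vector<bool>(n, false));
    for (std::size_t x = 0; x < n; ++x)
        for (std::size_t y = 0; y < n; ++y)
            cut[x][y] = (r[x][y] >= alpha);
    return cut;
}

// For a max-min transitive similarity relation, R_alpha is an equivalence
// relation, so the row of x in the cut is exactly the class of x.
// Returns one cluster label per object, numbered in order of first appearance.
std::vector<int> partitionFromCut(const FuzzyRelation& r, double alpha) {
    assert(isMaxTTransitive(r, tMin));  // precondition of this shortcut
    const std::vector<std::vector<bool>> cut = alphaCut(r, alpha);
    std::vector<int> label(r.size(), -1);
    int next = 0;
    for (std::size_t x = 0; x < r.size(); ++x) {
        if (label[x] != -1) continue;  // x already belongs to a class
        for (std::size_t y = 0; y < r.size(); ++y)
            if (cut[x][y]) label[y] = next;
        ++next;
    }
    return label;
}
```

Raising α splits classes and lowering it merges them, which is the hierarchy that the convex decomposition describes. For a relation that is reflexive and symmetric but not max-min transitive, this shortcut does not apply, which is the case the general algorithm of Section 3.1 is meant to cover.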
The procedure CreateCluster creates a new cluster and removes all classified objects from the stack T.

CreateCluster(e, T, Π)
{
  C = new cluster;
  Γ(e) ← {y ∈ T, y ≠ e / µ_R(y, e) = max_{x ∈ Ω} µ_R(e, x)};
  y* ← Γ(e).top();
  C.insert(e); C.insert(y*);
  Π.insert(C);
  T.delete(e); T.delete(y*);
}

The procedure AttractionCluster given below assigns objects to "the most similar" existing clusters. It removes all classified objects from the stack T.

AttractionCluster(e, Π)
{
  Let C* be the cluster such that σ(e, C*) = max σ(e, π) over the existing clusters π;
  C* ← C* ∪ {e};
  T.delete(e);
}

The general algorithm ensures the extraction of partitions for every similarity relation that is max-min transitive or max-$\Delta$ transitive.

4. CONCLUSION & PERSPECTIVES

In this work we proposed a general partition-finding algorithm for a reflexive and symmetric fuzzy relation. The algorithm determines partitions for all kinds of fuzzy relation transitivity (max-min, max-$\Delta$, etc.). In any case, the algorithm can be applied directly to a reflexive, symmetric and non-transitive fuzzy relation. The design of a spatial data mining system is also described in this work. We are going to integrate the software component for fuzzy relation manipulation into this system, for spatial data mining of accidents.

The fuzzy structural primitives can bring a new approach to accident analysis in the setting of this work. On the basis of fuzzy graphs and the clustering algorithm, the following objectives can be reached:

1. to define a formal model of the urban transportation network founded on the notion of fuzzy neighborhood graphs, by defining a neighborhood graph through the concept of fuzzy relation,
2. to build a risk map for the urban network using fuzzy paths [11],
3. to carry out itinerary risk analyses to detect dangerous paths,
4. to project the results obtained onto an exportable dynamic map on the Web,
5. to integrate and formulate information about "risk accidents" according to a specified point of view.

REFERENCES

1. Backer E., Cluster analysis by optimal decomposition of induced fuzzy sets, PhD Thesis, Delftse University, 1978.
2. Boulmakoul A., Structure Prétopologique Matroïdale : Application à la Décomposition des Systèmes Complexes, Conférence Internationale de Mathématiques Appliquées et Sciences de l'Ingénieur, Tome I, pp. 277-281, Casablanca, ENSEM, 14-19 Nov. 1996.
3. Boulmakoul A., Zeitouni K., Primitives structurales pour le data mining spatial, in Int. AMSE Conf., vol. 1, pp. 62-69, March 19-21, 2001, Rabat, Morocco.
4. Cohn A.G., Randel D.A., Cui Z., Taxonomies of logically defined qualitative spatial relations, Int. Journal of Human-Computer Studies, 43 (1995), 831-846.
5. Dussauchoy A., Paths algebra, similarities and system decomposition, Journal of Math. Analysis and Applications, Vol. 102, N° 1, 75-85.
6. Emptoz H., Modèle prétopologique pour la reconnaissance des formes, Thèse d'état, 1983, Université Claude Bernard Lyon, France.
7. Ester M., Kriegel H-P., Sander J., Spatial data mining: a database approach, Proc. in Lecture Notes in Computer Science, 1997, Vol. 1262, Springer, pp. 47-66.
8. Kim L., Fuzzy relation compositions and pattern recognition, Information Sciences 89 (1996) 107-130, Elsevier.
9. Murali V., Fuzzy equivalence relations, Fuzzy Sets and Systems 30 (1989) 155-163.
10. Nanopoulos A., Manolopoulos Y., Mining patterns from graph traversals, Data & Knowledge Engineering 37 (2001) 243-266, Elsevier.
11. Okada S., Soper T., A shortest path problem on a network with fuzzy arc lengths, Fuzzy Sets and Systems 109 (2000) 129-140.
12. Tamura S., Higuchi S., Tanaka K., Pattern classification based on fuzzy relations, IEEE Trans. Syst. Man Cybernet. 1 (1978) 61-66.
13. Tremolières R., The percolation method for an efficient grouping of data, Pattern Recognition, vol. 11, n°4, 1979.
14. Yager A., On general class of fuzzy connectives, Fuzzy Sets and Systems 4 (1980) 235-242.
15. Yang M., Shih H., Cluster analysis based on fuzzy relations, Fuzzy Sets and Systems 120 (2001) 197-212.
16. Zadeh L., Similarity relations and fuzzy ordering, Infor. Sci. 3 (1971) 177-200.
17. Zeitouni K., Chelghoum N., Boulmakoul A., Arbre de décision spatial multi-thèmes, in SFC'01, 17-21 Décembre 2001, Pointe-à-Pitre, Guadeloupe.
18. Zeitouni K., Yeh L., Le data mining spatial et les bases de données spatiales, in Revue Int. de géomatique, Vol. 9, n°4/1999, pp. 389-423.