CSE591 Data Mining


  • Dunham Pp224-226
  • Pp 234-235
  • CSE591 Data Mining

    1. 1. 6. Spatial Mining: Spatial Data and Structures, Images, Spatial Mining Algorithms
    2. 2. Definitions <ul><li>Spatial data is about instances located in a physical space </li></ul><ul><li>Spatial data has location or geo-referenced features </li></ul><ul><li>Some of these features are: </li></ul><ul><ul><li>Address, latitude/longitude (explicit) </li></ul></ul><ul><ul><li>Location-based partitions in databases (implicit) </li></ul></ul>
    3. 3. Applications and Problems <ul><li>Geographic information systems (GIS) store information related to geographic locations on Earth </li></ul><ul><ul><li>Weather, community infrastructure needs, disaster management, and hazardous waste </li></ul></ul><ul><li>Homeland security issues such as prediction of unexpected events and planning of evacuation </li></ul><ul><li>Remote sensing and image classification </li></ul><ul><li>Biomedical applications include medical imaging and illness diagnosis </li></ul>
    4. 4. Use of Spatial Data <ul><li>Map overlay – merging disparate data </li></ul><ul><ul><li>Different views of the same area: (Level 1) streets, power lines, phone lines, sewer lines, (Level 2) actual elevations, building locations, and rivers </li></ul></ul><ul><li>Spatial selection – find all houses near WSU </li></ul><ul><li>Spatial join – nearest for points, intersection for areas </li></ul><ul><li>Other basic spatial operations </li></ul><ul><ul><li>Region/range query for objects intersecting a region </li></ul></ul><ul><ul><li>Nearest neighbor query for objects closest to a given place </li></ul></ul><ul><ul><li>Distance scan asking for objects within a certain radius </li></ul></ul>
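The three basic operations on slide 4 can be sketched as plain linear scans over a set of points. This is only an illustration under stated assumptions: the `houses` dictionary, its coordinates, and the function names are hypothetical and not taken from the slides, and a real system would answer these queries through a spatial index rather than by scanning.

```python
import math

# Hypothetical sample data: name -> (x, y) coordinates in arbitrary units.
houses = {"h1": (1.0, 1.0), "h2": (4.0, 5.0), "h3": (9.0, 2.0), "h4": (2.0, 2.5)}

def range_query(points, xmin, ymin, xmax, ymax):
    """Region/range query: names of points inside an axis-aligned rectangle."""
    return sorted(n for n, (x, y) in points.items()
                  if xmin <= x <= xmax and ymin <= y <= ymax)

def nearest_neighbor(points, place):
    """Nearest neighbor query: name of the point closest to `place`."""
    return min(points, key=lambda n: math.dist(points[n], place))

def distance_scan(points, place, radius):
    """Distance scan: names of points within `radius` of `place`."""
    return sorted(n for n, p in points.items() if math.dist(p, place) <= radius)
```

Each function visits every point, so the cost grows linearly with the data size; the tree structures on the following slides exist to avoid exactly that.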
    5. 5. Spatial Data Structures <ul><li>Minimum bounding rectangles (MBR) </li></ul><ul><li>Different tree structures </li></ul><ul><ul><li>Quad tree </li></ul></ul><ul><ul><li>R-Tree </li></ul></ul><ul><ul><li>kd-Tree </li></ul></ul><ul><li>Image databases </li></ul>
    6. 6. MBR <ul><li>Represent a spatial object by the smallest axis-aligned rectangle [(x1,y1), (x2,y2)] that encloses it, or by several such rectangles </li></ul>(Figure: an MBR with corners labeled (x1,y1) and (x2,y2))
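As a minimal sketch of the MBR idea, the following computes the bounding rectangle of an object given as a list of vertices and tests whether two MBRs overlap. The function names and the vertex-list representation are assumptions made for illustration.

```python
def mbr(points):
    """Return the minimum bounding rectangle as ((x1, y1), (x2, y2))."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys)), (max(xs), max(ys))

def mbrs_intersect(a, b):
    """True if two MBRs overlap (touching edges count as overlap)."""
    (ax1, ay1), (ax2, ay2) = a
    (bx1, by1), (bx2, by2) = b
    return ax1 <= bx2 and bx1 <= ax2 and ay1 <= by2 and by1 <= ay2
```

An MBR test is a cheap filter: if two MBRs do not intersect, the objects inside them cannot intersect either, so the more expensive exact geometry test can be skipped.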
    7. 7. Tree Structures <ul><li>Quad Tree: every four quadrants in one layer form a parent quadrant in the layer above </li></ul><ul><ul><li>An example </li></ul></ul>
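The slide's example is not preserved in this transcript, so here is a small point quad tree sketch instead. It assumes a square region, a per-node capacity, and distinct points (inserting duplicates beyond the capacity would split forever); the class and method names are hypothetical.

```python
class QuadTree:
    def __init__(self, x, y, size, capacity=1):
        self.x, self.y, self.size = x, y, size  # lower-left corner, side length
        self.capacity = capacity
        self.points = []
        self.children = None  # four child quadrants once this node splits

    def insert(self, p):
        px, py = p
        inside = (self.x <= px < self.x + self.size and
                  self.y <= py < self.y + self.size)
        if not inside:
            return False
        if self.children is None:
            if len(self.points) < self.capacity:
                self.points.append(p)
                return True
            self._split()
        # Exactly one child quadrant contains the point.
        return any(c.insert(p) for c in self.children)

    def _split(self):
        h = self.size / 2
        self.children = [QuadTree(self.x + dx, self.y + dy, h, self.capacity)
                         for dx in (0, h) for dy in (0, h)]
        for q in self.points:  # push stored points down one layer
            any(c.insert(q) for c in self.children)
        self.points = []

    def depth(self):
        if self.children is None:
            return 1
        return 1 + max(c.depth() for c in self.children)
```

Dense areas end up with deeper subtrees while empty areas stay as single large quadrants, which is what makes the structure useful for focusing on interesting regions.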
    8. 8. R-Tree <ul><li>Indexing MBRs in a tree </li></ul><ul><ul><li>An R-tree of order m has at most m entries in one node </li></ul></ul><ul><ul><li>An example (order of 3) </li></ul></ul>(Figure: MBRs R1 to R8 and the R-tree that indexes them)
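The following sketch shows only the search side of an R-tree: each node stores the MBR of everything below it, and a query descends only into children whose MBR intersects the query rectangle. The tree is built by hand rather than by the usual insert-and-split procedure, and the rectangles and names (`a` to `e`) are hypothetical, not the R1 to R8 of the slide.

```python
def intersects(a, b):
    """Rectangles are (x1, y1, x2, y2) tuples."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    return ax1 <= bx2 and bx1 <= ax2 and ay1 <= by2 and by1 <= ay2

def enclose(rects):
    """MBR of a list of rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

class Node:
    def __init__(self, entries, leaf, order=3):
        assert len(entries) <= order  # at most `order` entries per node
        self.entries, self.leaf = entries, leaf
        # Leaf entries are (name, rect) pairs; internal entries are Nodes.
        self.mbr = enclose([e[1] if leaf else e.mbr for e in entries])

def search(node, query):
    """Names of all indexed rectangles that intersect `query`."""
    if node.leaf:
        return [name for name, rect in node.entries if intersects(rect, query)]
    hits = []
    for child in node.entries:
        if intersects(child.mbr, query):  # prune subtrees that cannot match
            hits.extend(search(child, query))
    return hits

leaf1 = Node([("a", (0, 0, 2, 2)), ("b", (1, 1, 3, 3)), ("c", (2, 0, 4, 1))], leaf=True)
leaf2 = Node([("d", (10, 10, 12, 12)), ("e", (11, 8, 13, 9))], leaf=True)
root = Node([leaf1, leaf2], leaf=False)
```

A query near the origin never touches `leaf2`, because its MBR fails the intersection test at the root.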
    9. 9. kd-Tree <ul><li>Indexing multi-dimensional data, one dimension for a level in a tree </li></ul><ul><ul><li>An example </li></ul></ul>
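A compact sketch of a kd-tree, assuming points are equal-length tuples and that the splitting dimension simply cycles with depth (one dimension per level, as the slide describes). The dictionary-based node layout and the function names are illustrative choices.

```python
import math

def build(points, depth=0):
    """Build a kd-tree by splitting on the median of one dimension per level."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return {"point": pts[mid], "axis": axis,
            "left": build(pts[:mid], depth + 1),
            "right": build(pts[mid + 1:], depth + 1)}

def nearest(node, target, best=None):
    """Return the stored point closest to `target`."""
    if node is None:
        return best
    if best is None or math.dist(node["point"], target) < math.dist(best, target):
        best = node["point"]
    diff = target[node["axis"]] - node["point"][node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, target, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if abs(diff) < math.dist(best, target):
        best = nearest(far, target, best)
    return best
```

The pruning step in `nearest` is what distinguishes the tree search from a linear scan: whole subtrees are skipped when they lie farther away than the best candidate found so far.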
    10. 10. Common Tasks dealing with Spatial Data <ul><li>Data focusing </li></ul><ul><ul><li>Spatial queries </li></ul></ul><ul><ul><li>Identifying interesting parts in spatial data </li></ul></ul><ul><ul><li>Progressive refinement can be applied in a tree structure </li></ul></ul><ul><li>Feature extraction </li></ul><ul><ul><li>Extracting important/relevant features for an application </li></ul></ul><ul><li>Classification or others </li></ul><ul><ul><li>Using training data to create classifiers </li></ul></ul><ul><ul><li>Many mining algorithms can be used </li></ul></ul><ul><ul><ul><li>Classification, clustering, associations </li></ul></ul></ul>
    11. 11. Spatial Mining Tasks <ul><li>Spatial classification </li></ul><ul><li>Spatial clustering </li></ul><ul><li>Spatial association rules </li></ul>
    12. 12. Spatial Classification <ul><li>Use spatial information at different (coarse/fine) levels (different indexing trees) for data focusing </li></ul><ul><li>Determine relevant spatial or non-spatial features </li></ul><ul><li>Apply standard supervised learning algorithms </li></ul><ul><ul><li>e.g., decision trees </li></ul></ul>
    13. 13. Spatial Clustering <ul><li>Use tree structures to index spatial data </li></ul><ul><li>DBSCAN: R-tree </li></ul><ul><li>CLIQUE: Grid or Quad tree </li></ul><ul><li>Clustering with spatial constraints (obstacles → need to adjust notion of distance) </li></ul>
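To make the DBSCAN entry concrete, here is a minimal sketch of the algorithm with its two standard parameters, `eps` and `min_pts`. As a simplifying assumption, the neighborhood lookup `region_query` is a linear scan; that is the step an R-tree would accelerate. The label convention (cluster ids from 0, noise as -1) is a choice made here for illustration.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    NOISE = -1
    labels = [None] * len(points)

    def region_query(i):
        # Linear scan; a spatial index such as an R-tree would replace this.
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(i)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(j)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
        cluster += 1
    return labels
```

Handling obstacles, as in the last bullet, would mean replacing `math.dist` with a distance that accounts for them, for example a path length around the obstacle.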
    14. 14. Spatial Association Rules <ul><li>Spatial objects are of major interest, not transactions </li></ul><ul><li>A → B </li></ul><ul><ul><li>A, B can be either spatial or non-spatial (3 combinations) </li></ul></ul><ul><ul><li>What is the fourth combination? </li></ul></ul><ul><li>Association rules can be found w.r.t. the 3 types </li></ul>
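As a small worked illustration of one rule type, the sketch below evaluates a rule with a spatial antecedent and a non-spatial consequent, roughly "close to the park → expensive", by computing its support and confidence over a set of spatial objects. The house data, the park location, the radius, and the price threshold are all hypothetical.

```python
import math

# Hypothetical data: houses with a location and a non-spatial price attribute.
houses = [
    {"loc": (1, 1), "price": 900},
    {"loc": (2, 1), "price": 850},
    {"loc": (1, 2), "price": 400},
    {"loc": (8, 8), "price": 300},
    {"loc": (9, 7), "price": 950},
]
park = (0, 0)

def close_to_park(h, radius=3.0):
    """Spatial predicate (antecedent A)."""
    return math.dist(h["loc"], park) <= radius

def expensive(h, threshold=800):
    """Non-spatial predicate (consequent B)."""
    return h["price"] >= threshold

def rule_stats(objects, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent."""
    n_a = sum(1 for o in objects if antecedent(o))
    n_ab = sum(1 for o in objects if antecedent(o) and consequent(o))
    support = n_ab / len(objects)
    confidence = n_ab / n_a if n_a else 0.0
    return support, confidence

support, confidence = rule_stats(houses, close_to_park, expensive)
```

Here the objects being counted are the houses themselves, not transactions, which matches the first bullet of the slide.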
    15. 15. Summary <ul><li>Spatial data can contain both spatial and non-spatial features. </li></ul><ul><li>When spatial information becomes the dominant interest, spatial data mining should be applied. </li></ul><ul><li>Spatial data structures can facilitate spatial mining. </li></ul><ul><li>Standard data mining algorithms can be modified for spatial data mining, with a substantial amount of preprocessing to take spatial information into account. </li></ul>
    16. 16. Bibliography <ul><li>M. H. Dunham. Data Mining – Introductory and Advanced Topics. Prentice Hall. 2003. </li></ul><ul><li>R. O. Duda, P. E. Hart, D. G. Stork. Pattern Classification, 2nd edition. Wiley-Interscience. </li></ul><ul><li>J. Han and M. Kamber. Data Mining – Concepts and Techniques. Morgan Kaufmann. 2001. </li></ul>