Theories and Applications of Spatial-Temporal Data Mining and Knowledge Discovery

Yee Leung



  1. 1. Theories and Applications of Spatial-Temporal Data Mining and Knowledge Discovery. Yee Leung. Email: [email_address]. Department of Geography and Resource Management, The Chinese University of Hong Kong.
  2. 4. [Figure slide with panels a) and b)]
  3. 5. [Figure slide with panels a) and b)]
  4. 9. Daily rainfall data of two stations in the Pearl River basin of China
  5. 11. The monthly sunspot time series.
  6. 12. The Portuguese Stock Index PSI-20 evolution from 1993 to 2002 (adapted from J.A.O. Matos et al. / Physica A 342 (2004) 665–676)
  7. 13. Outbreak of Avian Flu in different regions
  8. 15. What are the structures and processes hidden in spatial data? What are the concepts hidden in the information system? Do the concepts form a knowledge structure?
  9. 16. Typhoon Tracks Adapted from Wang and Chan
  10. 17. Typhoon/Hurricane Tracking. Objective: intensity, track (landfalling, recurvature). Object: the space-time track of unusually low sea-surface air pressure in the x-y-z plane. Data: potential temperature, horizontal velocity, vertical velocity, relative humidity, horizontal wind, etc. Data volume: hundreds to thousands of gigabytes within a specific time interval.
  11. 21. Data Mining in Hyperspectral Images. 1. Objective: classification, pattern recognition. 2. Object: spectral signatures of objects. 3. Data: spectral and non-spectral data. 4. Data volume, e.g.: AVIRIS: 0.4 to 2.45 micrometers, 224 bands; HYDICE: 0.4 to 2.5 micrometers, 210 bands; Hyperion: 0.4 to 2.5 micrometers, 220 bands, 30 m resolution.
  12. 22. The Objective of Knowledge Discovery and Data Mining. Fayyad: the discovery of non-trivial, novel, potentially useful and interpretable knowledge/information from data. Data → Information → Knowledge → Decision.
  13. 23. Characteristics of Spatial Data: 1. Voluminous; 2. Sparse; 3. Diverse; 4. Complex; 5. Dynamic; 6. Redundant; 7. Imperfect (random, fuzzy, granular, incomplete, noisy); 8. Multi-scale.
  14. 24. Main Tasks of Spatial Knowledge Discovery and Data Mining: 1. Clustering; 2. Classification; 3. Association; 4. Processes. Spatial relations, temporal relations, spatial-temporal relations; in particular, the local-global issue.
  15. 25. CLUSTERING: The Scale-Space Filtering Approach; The Regression-Classes Decomposition Approach.
  16. 26. Scale Space Theory. Given a primary image f(x) viewed at a distance of σ from the human eye, the observed blurred image f(x, σ) is determined by the diffusion (heat) equation ∂f(x, σ)/∂σ = σ∇²f(x, σ), with f(x, 0) = f(x). The solution of this equation is explicitly expressed as f(x, σ) = f(x) ∗ g(x, σ), where '∗' denotes the convolution operation and g(x, σ) is the Gaussian function.
  17. 27. If the training samples D_l = {x_i, i = 1, …, N} are treated as an imaginary image f(x, D_l) = (1/N) Σ_{i=1}^{N} δ(x − x_i) (a sum of point masses at the sample locations), then the corresponding blurred image f(x, σ, D_l) at scale σ can be specified by f(x, σ, D_l) = (1/N) Σ_{i=1}^{N} g(x − x_i, σ).
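The blurring described on the two slides above can be illustrated with a minimal 1-D sketch (not the authors' implementation): the samples are smoothed with a Gaussian kernel at increasing scales σ, clusters are taken as the modes (local maxima) of the blurred "image", and the range of scales over which a clustering persists corresponds to the lifetime notion on the next slide. The toy data, grid and scale values are hypothetical.

```python
import numpy as np

def blurred_image(samples, grid, sigma):
    """f(x, sigma, D) = (1/N) * sum_i g(x - x_i, sigma): samples blurred by a Gaussian kernel."""
    d = grid[:, None] - samples[None, :]                      # pairwise differences
    g = np.exp(-d**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    return g.mean(axis=1)

def count_modes(f):
    """Number of local maxima of the blurred image = number of clusters at this scale."""
    return int(np.sum((f[1:-1] > f[:-2]) & (f[1:-1] > f[2:])))

# Toy 1-D data: three groups of points (hypothetical).
rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(0, 0.3, 50),
                          rng.normal(3, 0.3, 50),
                          rng.normal(7, 0.3, 50)])
grid = np.linspace(samples.min() - 2, samples.max() + 2, 2000)

# Track how the number of clusters shrinks as the scale sigma grows.
for sigma in [0.1, 0.3, 1.0, 2.0, 4.0]:
    print(f"sigma = {sigma:4.1f}  ->  {count_modes(blurred_image(samples, grid, sigma))} clusters")
```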
  18. 28. Essentials of Clustering by Scale-Space Filtering: 1. Visual system simulation; 2. Cluster validity check; 3. Clustering validity check; 4. Relevant concepts: (a) lifetime of a cluster, (b) lifetime of a clustering, (c) compactness, (d) isolatedness.
  19. 41. Ms-time plot of clustering results for earthquakes (Ms ≥ 6): a) 3 clusters in the 59th–95th scale range; b) 17 clusters at the 6th scale step.
  20. 42. Temporal segmentation of strong earthquakes (Ms ≥ 6.0), 1290 A.D.–2000 A.D. Scale-space for earthquakes (Ms ≥ 6).
  21. 43. Indices of clustering along the time scale for earthquakes (Ms ≥ 6.0): a) number of clusters; b) lifetime, isolation and compactness of the clustering.
  22. 44. Ms-time plot of clustering results for earthquakes (Ms ≥ 4.7): a) 2 clusters in the 74th–112th scale range; b) 18 clusters at the 10th scale step.
  23. 45. Temporal segmentation of strong earthquakes (Ms ≥ 4.7), 1484 A.D.–2000 A.D. Indices of clustering along the time scale for earthquakes (Ms ≥ 4.7): a) number of clusters (the vertical axis shows only values no larger than 150); b) lifetime, isolation and compactness of the clustering.
  24. 46. Table 1. Seismically active periods and episodes obtained by the clustering algorithm and by seismologists (the number in parentheses is the number of earthquakes in the cluster).
  25. 48. Advantages of Scale-Space Filtering: free from solving a global optimization problem; independent of initialization; robust; outlier detection; a generalization of scale-related algorithms; consistent with the human visual system.
  26. 49. 5. Scale-Space Clustering: Scale-Space Filtering for Simulated Data.
  27. 50. 5. Scale-Space Clustering: Scale-Space Filtering for Remote-Sensing Data (figure labels: Clustering Tree, Quasi-Light).
  28. 51. Clustering by Regression-Classes Decomposition Method
  29. 52. Simple Gaussian Class
  30. 53. Linear Structure
  31. 54. Identification of line objects in remotely sensed data
  32. 55. Ellipsoidal Structure
  33. 57. Extraction of two ellipsoidal features
  34. 58. General Curvilinear Structure
  35. 59. Complex Shape Structure
  36. 60. ANALYSIS OF SPATIAL RELATIONSHIPS. Global description: Moran's I; Geary's c; OLR. Local description: local Moran's I; local Geary's c; the G statistic; geographically weighted regression; mixture distribution.
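As a small illustration of the global statistics listed above, the sketch below computes global Moran's I from the standard formula I = (n/S0) Σ_ij w_ij z_i z_j / Σ_i z_i², where z_i are deviations from the mean and S0 is the sum of the weights. The 4-region contiguity matrix and attribute values are invented for the example; they are not data from the presentation.

```python
import numpy as np

def morans_i(values, w):
    """Global Moran's I for values y and a spatial weights matrix w (n x n, zero diagonal)."""
    y = np.asarray(values, dtype=float)
    z = y - y.mean()                       # deviations from the mean
    s0 = w.sum()                           # sum of all weights
    n = len(y)
    return (n / s0) * (z @ w @ z) / (z @ z)

# Hypothetical 4-region example with binary rook contiguity on a 2x2 grid.
w = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)
y = [10.0, 12.0, 30.0, 33.0]               # an attribute observed in each region
print("Moran's I =", round(morans_i(y, w), 3))
```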
  37. 61. Geographically Weighted Regression. Hypothesis testing: 1. H0: no difference between OLR and GWR; 2. H0: a_{1k} = a_{2k} = … = a_{nk}.
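The GWR idea on this slide, locally varying coefficients a_{ik}, can be sketched as below. This is a minimal illustration rather than a full GWR package: at each calibration location a weighted least-squares fit β(u_i) = (XᵀW(u_i)X)⁻¹XᵀW(u_i)y is solved with Gaussian kernel weights w_i = exp(−d_i²/(2b²)). The bandwidth b, the coordinates and the synthetic data are all hypothetical.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Fit beta(u_i) = (X'W X)^-1 X'W y at every location, with Gaussian kernel weights."""
    X = np.column_stack([np.ones(len(y)), X])           # add an intercept column
    betas = []
    for u in coords:
        d = np.linalg.norm(coords - u, axis=1)          # distances to the calibration point
        w = np.exp(-d**2 / (2 * bandwidth**2))          # Gaussian distance-decay weights
        XtW = X.T * w                                   # equivalent to X' diag(w)
        betas.append(np.linalg.solve(XtW @ X, XtW @ y)) # local weighted least squares
    return np.array(betas)                              # one coefficient row per location

# Hypothetical data: a covariate whose effect drifts from west to east.
rng = np.random.default_rng(1)
coords = rng.uniform(0, 10, size=(100, 2))
x = rng.normal(size=100)
y = (1.0 + 0.3 * coords[:, 0]) * x + rng.normal(scale=0.2, size=100)
betas = gwr_coefficients(coords, x.reshape(-1, 1), y, bandwidth=2.0)
print("local slope ranges from", betas[:, 1].min().round(2), "to", betas[:, 1].max().round(2))
```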
  38. 64. (Regression-Classes Decomposition Method)
  39. 65. CLASSIFICATION: The Neural Network Approach; The Classification and Regression Tree; The Statistical Classifiers.
  40. 66. Information Extraction and Classification. Neural Networks for Classification: MLP-BP.
  41. 67. Some Typical Feedforward Neural Networks. Perceptron: in the late 1950s, layered feed-forward networks were named perceptrons; today, "perceptron" refers to single-layer, feed-forward networks. See Fig. 8: each output unit O of a multi-output perceptron is fed independently from the input units. Figure 8. Perceptrons.
  42. 68. Some Typical Feedforward Neural Networks (cont'd). Multilayer Feedforward Neural Network: learning algorithms for multilayer networks are neither efficient nor guaranteed to converge to a global optimum; the most popular learning method is back-propagation. Back-propagation learning: the restaurant problem uses a 2-layer network with 10 attributes = 10 input units and 4 hidden units (see Fig. 13). Fig. 13. A 2-layer feedforward network for the restaurant problem.
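For concreteness, here is a minimal sketch of the back-propagation training loop for a 2-layer (one hidden layer) feed-forward network in the spirit of the MLP-BP classifier above. It uses plain NumPy gradient descent on a squared-error loss with sigmoid units; the network size, learning rate and XOR-style toy data are illustrative assumptions, not the slide's restaurant configuration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: XOR, a classic problem a single-layer perceptron cannot solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(scale=0.5, size=(2, 4)), np.zeros(4)   # input -> 4 hidden units
W2, b2 = rng.normal(scale=0.5, size=(4, 1)), np.zeros(1)   # hidden -> 1 output unit
lr = 1.0

for epoch in range(10000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    y = sigmoid(h @ W2 + b2)
    # Backward pass (back-propagation of the squared-error gradient)
    delta_out = (y - t) * y * (1 - y)             # output-layer error signal
    delta_hid = (delta_out @ W2.T) * h * (1 - h)  # error propagated back to the hidden layer
    W2 -= lr * h.T @ delta_out;  b2 -= lr * delta_out.sum(axis=0)
    W1 -= lr * X.T @ delta_hid;  b1 -= lr * delta_hid.sum(axis=0)

print("predictions:", y.ravel().round(2))         # typically approaches [0, 1, 1, 0] after training
```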
  43. 76. Competitive Pattern Recognition by Recurrent NN.
  44. 78. Typhoon Tracks Adapted from Wang and Chan
  45. 79. Trees by Classification and Regression Tree (CART). MSW 6/12/18: maximum sustained wind of the TC 6/12/18 hours before recurvature. 0: recurve, 1: straight.
  46. 80. Rules by CART: 1. If the MSW of a TC is smaller than or equal to 34 m/s and the MSW of that TC is smaller than 2 m/s 6 hours later, then the TC will recurve in 12 hours with 96% accuracy. 2. If the MSW of a TC is smaller than or equal to 34 m/s and the MSW of that TC is larger than 2 m/s 6 hours later, the TC will move straight in 12 hours with 86.8% accuracy. 3. If the MSW of a TC is larger than 34 m/s, it will recurve in 18 hours with 94.1% accuracy.
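A tree producing rules of this form can be grown with any CART implementation. The sketch below is purely illustrative: it fits scikit-learn's DecisionTreeClassifier to hypothetical MSW-based features (MSW and MSW_change_6h) and prints the induced if-then rules; the feature names, synthetic labels and thresholds are invented for the example and are not the study's real tropical-cyclone dataset.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical training table: MSW (m/s) and its change 6 hours later;
# label 0 = recurve, 1 = straight (the same coding as on the slide).
rng = np.random.default_rng(42)
n = 300
msw = rng.uniform(10, 60, n)
msw_change_6h = rng.normal(0, 3, n)
# Invented labelling rule, just to give CART a structure to recover.
straight = ((msw <= 34) & (msw_change_6h > 2)).astype(int)

X = np.column_stack([msw, msw_change_6h])
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, straight)

# Print the fitted tree as if-then rules, analogous to the CART rules above.
print(export_text(tree, feature_names=["MSW", "MSW_change_6h"]))
```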
  47. 81. DISCOVERY OF TEMPORAL PROCESSES: The Multifractal Approach; Conventional Time Series Analyses.
  48. 82. Mining of Scaling Behavior by Multifractal Analysis.
  49. 83. Multiplicative Cascade: an approach to the study of scaling behavior with multiple scales (granules). The multiplicative binomial cascade.
  50. 84. Schematic representation of the cascade (adapted from Puente and Lopez, 1995, Physics Letters A).
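To make the binomial cascade concrete, the sketch below generates one: starting from unit mass on [0, 1], each interval is repeatedly halved and its mass is split in proportions p and 1 − p. After k levels this yields 2^k weights whose heterogeneity across scales is what multifractal analysis quantifies. The choice p = 0.7 and the number of levels are arbitrary illustration values.

```python
import numpy as np

def binomial_cascade(levels, p=0.7):
    """Deterministic multiplicative binomial cascade: split each interval's mass as p : (1 - p)."""
    mass = np.array([1.0])                            # unit mass on [0, 1]
    for _ in range(levels):
        mass = np.outer(mass, [p, 1.0 - p]).ravel()   # each cell spawns a left (p) and right (1-p) child
    return mass                                       # 2**levels weights, summing to 1

mu = binomial_cascade(levels=10, p=0.7)
print("number of cells:", mu.size, "| total mass:", round(mu.sum(), 6))
print("largest / smallest cell mass:", mu.max(), mu.min())
```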
  51. 86. TEMPORAL ANALYSIS: linear time-invariant system → self-similarity → multiscaling → infinitely divisible cascade. Stationary process. Non-stationary process: random walk; fractional Brownian motion (fBm); multifractal analysis of stochastic trends.
  52. 87. The Multifractal Approach: establish a data model for stochastic time series; discovery of relevant models in stochastic time series.
  53. 88. MF-DFA. Detrended fluctuation analysis (DFA) is a method for detecting long-range correlation and fractal properties in both stationary and non-stationary time series. MF-DFA (multifractal DFA), which is based on DFA, gives a full description of the more complicated scaling behavior of a time series.
  54. 89. MF-DFA. Given a time series x_k of length N. Step 1: construct the profile Y(i) = Σ_{k=1}^{i} (x_k − ⟨x⟩), i = 1, 2, …, N. Step 2: divide the profile Y(i) into N_s = int(N/s) non-overlapping segments of equal length s. Since N is usually not a multiple of s, a short part at the end of the profile may remain; in order not to disregard this part of the series, the same procedure is repeated starting from the opposite end. Thereby, 2N_s segments are obtained altogether.
  55. 90. MF-DFA. Step 3: calculate the local trend for each of the 2N_s segments by a least-squares fit of the series. Then determine the variance F^2(ν, s) = (1/s) Σ_{i=1}^{s} {Y[(ν − 1)s + i] − y_ν(i)}^2 for each segment ν, ν = 1, …, N_s, and F^2(ν, s) = (1/s) Σ_{i=1}^{s} {Y[N − (ν − N_s)s + i] − y_ν(i)}^2 for ν = N_s + 1, …, 2N_s. Here, y_ν(i) is the fitting polynomial in segment ν, whose order m can be 1, 2, 3, …. Step 4: average over all segments to obtain the qth-order fluctuation function, defined as F_q(s) = {(1/(2N_s)) Σ_{ν=1}^{2N_s} [F^2(ν, s)]^{q/2}}^{1/q}, where q ≠ 0 and s ≥ m + 2.
  56. 91. MF-DFA. Step 5: determine the scaling behavior of the fluctuation function by analyzing log-log plots of F_q(s) versus s for each value of q. If F_q(s) ~ s^{h(q)} for large values of s, we obtain the exponent h(q), which may in general depend on q. The Hurst exponent is H = h(q = 2) for a stationary time series and H = h(q = 2) − 1 for a non-stationary time series.
  57. 92. MF-DFA. If h(q) is constant for all q, the corresponding time series is monofractal; if h(q) varies with q, the time series is multifractal.
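The five MF-DFA steps above translate almost directly into code. The sketch below is a compact NumPy rendering of the procedure as stated on the slides (profile, segmentation from both ends, polynomial detrending of order m, qth-order fluctuation function, and the log-log slope h(q)); the toy series (uncorrelated Gaussian noise) and the scale and q grids are chosen arbitrarily for illustration, and the q = 0 case uses the usual logarithmic average.

```python
import numpy as np

def mfdfa(x, scales, q_values, m=1):
    """Multifractal detrended fluctuation analysis following Steps 1-5."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    Y = np.cumsum(x - x.mean())                        # Step 1: the profile Y(i)
    Fq = np.zeros((len(q_values), len(scales)))
    for j, s in enumerate(scales):
        Ns = N // s
        # Step 2: non-overlapping segments from both ends -> 2*Ns segments
        segs = [Y[v * s:(v + 1) * s] for v in range(Ns)]
        segs += [Y[N - (v + 1) * s:N - v * s] for v in range(Ns)]
        # Step 3: detrend each segment with a polynomial of order m; local variance F^2(v, s)
        i = np.arange(s)
        F2 = np.array([np.mean((seg - np.polyval(np.polyfit(i, seg, m), i)) ** 2)
                       for seg in segs])
        # Step 4: q-th order fluctuation function F_q(s)
        for k, q in enumerate(q_values):
            if q == 0:                                 # q -> 0 limit: logarithmic average
                Fq[k, j] = np.exp(0.5 * np.mean(np.log(F2)))
            else:
                Fq[k, j] = np.mean(F2 ** (q / 2.0)) ** (1.0 / q)
    # Step 5: h(q) is the slope of log F_q(s) versus log s
    return np.array([np.polyfit(np.log(scales), np.log(Fq[k]), 1)[0]
                     for k in range(len(q_values))])

# Toy series: uncorrelated Gaussian noise, for which h(2) should be close to 0.5.
rng = np.random.default_rng(0)
x = rng.normal(size=10000)
scales = np.unique(np.logspace(1.2, 3, 15).astype(int))   # all scales satisfy s >= m + 2
q_values = [-5, -2, 0, 2, 5]
print(dict(zip(q_values, np.round(mfdfa(x, scales, q_values), 3))))
```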
  58. 93. (adapted from Peng et al., 1994)
  59. 102. Daily rainfall data of two stations in the Pearl River basin of China
  60. 103. Log-log plots of F_q(s) versus s, with q = 2, for the daily rainfall time series of station 56691 in the Pearl River basin (left) and station Chuantang in the East River basin (right).
  61. 104. The h(q) curves of the daily rainfall time series of stations in the Pearl River basin (left) and stations in the East River basin (right).
  62. 105. The curves of daily rainfall time series of stations in the Pearl River basin (left) and stations in the East River basin (right).
  63. 106. The curves of daily rainfall time series of stations in the Pearl River basin (left) and stations in the East River basin (right)
  64. 107. The curves of daily rainfall time series of 5 stations in the Pearl River basin
  65. 108. The curves of daily rainfall time series of stations in the Pearl River basin (left) and stations in the East River basin (right).
  66. 109. The curves of daily rainfall time series of stations in the Pearl River basin (left) and stations in the East River basin (right). The solid lines are their cascade-model fits.
  67. 110. The correlation between the altitude of the rainfall stations in the East River basin and the D(2) values of their rainfall time series.
  68. 111. Elevation (m above MSL) of rainfall stations in the East River basin with the D(2) values of their rainfall data.
  69. 112. DISCOVERY OF KNOWLEDGE STRUCTURES: The Concept Lattice Approach.
  70. 113. Discovery of Hierarchical Knowledge from Relational Spatial Data.
  71. 115. Spatial Concept/Class and Data Encapsulation
  72. 116. Concept Hierarchy
  73. 117. Inheritance
  74. 118. Generalization and Specialization
  75. 119. Summary: (1) the concept lattice serves as a mathematical foundation for object-oriented spatial information systems; (2) the concept lattice can be employed as a method to unravel hierarchical structure from a spatial information system; (3) it provides a bridge between relational spatial information systems (vector-based, raster-based) and object-oriented spatial information systems.
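As a small illustration of the concept-lattice idea summarised above, the sketch below enumerates the formal concepts of a tiny binary object-attribute context, using the standard closure definition (a concept is a pair (extent, intent) with extent′ = intent and intent′ = extent). The toy spatial context of land-cover objects and attributes is invented for the example; a real relational spatial information system would supply the incidence relation.

```python
from itertools import chain, combinations

# Hypothetical binary context: spatial objects x attributes (membership = object has attribute).
objects = ["lake", "river", "forest", "urban"]
attributes = ["water", "natural", "linear", "built-up"]
incidence = {
    "lake":   {"water", "natural"},
    "river":  {"water", "natural", "linear"},
    "forest": {"natural"},
    "urban":  {"built-up"},
}

def common_attributes(objs):
    """A' : attributes shared by every object in the set (all attributes if the set is empty)."""
    return set(attributes) if not objs else set.intersection(*(incidence[o] for o in objs))

def common_objects(attrs):
    """B' : objects possessing every attribute in the set."""
    return {o for o in objects if attrs <= incidence[o]}

# Enumerate formal concepts by closing every subset of objects: the pair ((A')', A').
concepts = set()
for objs in chain.from_iterable(combinations(objects, r) for r in range(len(objects) + 1)):
    intent = common_attributes(set(objs))
    extent = common_objects(intent)
    concepts.add((frozenset(extent), frozenset(intent)))

# Ordered by extent size, the concepts form the concept hierarchy (lattice) of the context.
for extent, intent in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(extent), "<->", sorted(intent))
```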
  76. 120. Yee Leung. Knowledge Discovery in Spatial Data. Berlin: Springer-Verlag, 2010. [email_address] IGU-Commission on Modeling Geographical Systems http://www.science.mcmaster.ca/~igu~cmgs/
