A Journey through the
Spatial Data Mining and
Geographic Knowledge
Discovery Jungle
Dr. Kam Tin Seong PhD
Associate Professor of Information Systems (Practice)
School of Information Systems
Singapore Management University
E-mail: tskam@smu.edu.sg




                              Copyright © 2011, SAS Institute Inc. All rights reserved.
Content

 Motivations
 Interactive exploratory analysis
 Distribution analysis
 Geographic data visualisation
 Visualising and detecting spatio-temporal patterns




Motivations

 Availability of massive, high-dimensional, and complex
  geospatially referenced data
 General lack of spatial data visualisation and analysis
  functions in data analysis software
 General lack of data analytics techniques in
  conventional GIS
 There is an urgent need for effective and efficient
  methods to visualise and detect unknown and
  unexpected information from these massive datasets




Taxi Travel Log Case Study

 One-day taxi travel log: 278,676 trips
 Number of variables: 53




Initial Data Exploration: Univariate

 Overview of the data
 Detect outliers
 Missing data
 Identify new variables
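A minimal sketch of these univariate checks, using only the Python standard library. The fare values, the use of None to mark missing data, and the 1.5 x IQR outlier rule are illustrative assumptions; the slides do not state which variables or rules were used in the case study.

```python
import statistics

def univariate_summary(values):
    """Summarise one variable: size, missing count, quartiles, IQR outliers."""
    # Assumption: missing observations are recorded as None.
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)

    # Quartiles from the standard library (default 'exclusive' method).
    q1, median, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1

    # Assumption: flag values beyond 1.5 * IQR from the quartiles as outliers.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in present if v < low or v > high]

    return {
        "n": len(present),
        "missing": missing,
        "median": median,
        "outliers": outliers,
    }

# Hypothetical trip fares, not taken from the taxi travel log.
fares = [5.2, 6.1, 5.8, None, 7.0, 6.4, 48.5, 5.9, None, 6.6]
summary = univariate_summary(fares)
print(summary)
```

Running the same function over each variable gives a quick overview and highlights which variables need cleaning before further analysis.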




Initial Data Exploration: Bivariate




Data Cleaning and Transformation

 Data cleaning
 Derive new variables: time interval, travel time, etc.
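One way the two derived variables could be computed is sketched below. The timestamp format, the sample timestamps, the function name, and the 60-minute interval width are all hypothetical; the slides do not describe the actual field layout of the travel log.

```python
from datetime import datetime

def derive_trip_variables(pickup, dropoff, interval_minutes=60):
    """Return (travel time in minutes, index of the pickup time interval)."""
    # Assumption: timestamps are strings in this format.
    fmt = "%Y-%m-%d %H:%M:%S"
    start = datetime.strptime(pickup, fmt)
    end = datetime.strptime(dropoff, fmt)

    # Travel time: elapsed time between pickup and drop-off.
    travel_minutes = (end - start).total_seconds() / 60

    # Time interval: which fixed-width slot of the day the pickup falls in.
    minutes_since_midnight = start.hour * 60 + start.minute
    interval_index = minutes_since_midnight // interval_minutes

    return travel_minutes, interval_index

# Hypothetical trip, not taken from the taxi travel log.
trip = derive_trip_variables("2011-01-01 08:15:00", "2011-01-01 08:42:30")
print(trip)
```

With hourly intervals, the interval index can then serve as the grouping variable when comparing patterns across the day.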




Geographic Data Visualisation




Visualising and Detecting Spatio-Temporal
Patterns with Interactive Brushing




Visualising and Detecting Spatio-Temporal
Patterns with Animated Map




Visualising and Detecting Spatio-Temporal
Patterns with Trellis Maps




Visualising and Detecting Spatio-Temporal
Point Patterns




Visualising and Detecting Spatio-Temporal
Point Patterns with Density Map
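The slide does not say how the density surface was produced. As a rough illustration of the idea only, the sketch below counts points per square grid cell (simple grid binning, not kernel density estimation). The coordinates, the cell size, and the function name are hypothetical.

```python
from collections import Counter

def grid_density(points, cell_size):
    """Count how many points fall into each square grid cell."""
    counts = Counter()
    for x, y in points:
        # Map each coordinate to the integer index of its containing cell.
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts

# Hypothetical pickup locations in arbitrary planar units.
pickups = [(0.2, 0.3), (0.8, 0.1), (1.4, 0.5), (0.5, 0.9), (2.7, 2.2)]
density = grid_density(pickups, cell_size=1.0)
print(density)
```

Shading each cell by its count gives a coarse density map; computing one grid per time interval would extend the same idea to spatio-temporal comparison.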




Q&A





SAS business analytics forum 2011 kam
