SlideShare a Scribd company logo
Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN))
Organization Spatial Data Mining Fall 2011
1. Introduction
2. Region Discovery—Finding Interesting Places in Spatial
Datasets
3. Project3
4. CLEVER: a Spatial Clustering Algorithm Supporting Plug-in
Fitness Functions
5. [Spatial Regression]
Brief Introduction
to Spatial Data Mining
Spatial data mining is the process of discovering
interesting, useful, non-trivial patterns from large spatial
datasets
Reading Material: http://en.wikipedia.org/wiki/Spatial_analysis
Spatial Statistics Software: http://www.spatial-statistics.com/
Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN))
Examples of Spatial Patterns
Historic Examples (section 7.1.5, pp. 186)
1855 Asiatic Cholera in London: A water pump identified as the source
Fluoride and healthy gums near Colorado river
Theory of Gondwanaland - continents fit like pieces of a jigsaw puzlle
Modern Examples
Cancer clusters to investigate environment health hazards
Crime hotspots for planning police patrol routes
Bald eagles nest on tall trees near open water
Nile virus spreading from north east USA to south and west
Unusual warming of Pacific ocean (El Nino) affects weather in USA
Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN))
Why Learn about Spatial Data Mining?
Two basic reasons for new work
Consideration of use in certain application domains
Provide fundamental new understanding
Application domains
Scale up secondary spatial (statistical) analysis to very large datasets
• Describe/explain locations of human settlements in last 5000 years
• Find cancer clusters to locate hazardous environments
• Prepare land-use maps from satellite imagery
• Predict habitat suitable for endangered species
Find new spatial patterns
• Find groups of co-located geographic features
Exercise. Name 2 application domains not listed above.
Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN))
Why Learn about Spatial Data Mining? - 2
New understanding of geographic processes for Critical questions
Ex. How is the health of planet Earth?
Ex. Characterize effects of human activity on environment and ecology
Ex. Predict effect of El Nino on weather, and economy
Traditional approach: manually generate and test hypothesis
But, spatial data is growing too fast to analyze manually
• Satellite imagery, GPS tracks, sensors on highways, …
Number of possible geographic hypothesis too large to explore manually
• Large number of geographic features and locations
• Number of interacting subsets of features grow exponentially
• Ex. Find tele connections between weather events across ocean and land areas
SDM may reduce the set of plausible hypothesis
Identify hypothesis supported by the data
For further exploration using traditional statistical methods
Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN))
Autocorrelation
Items in a traditional data are independent of each other,
whereas properties of locations in a map are often “auto-correlated”.
First law of geography [Tobler]:
Everything is related to everything, but nearby things are more related
than distant things.
People with similar backgrounds tend to live in the same area
Economies of nearby regions tend to be similar
Changes in temperature occur gradually over space(and time)
Waldo Tobler in 2000
Papers on “Laws in Geography”: http://www.geog.ucsb.edu/~good/papers/393.pdf
http://www.cs.uh.edu/~ceick/DM/GOO10.pdf
Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN))
Characteristics of Spatial Data Mining
Auto correlation
Patterns usually have to be defined in the spatial attribute subspace
and not in the complete attribute space
Longitude and latitude (or other coordinate systems) are the glue that
link different data collections together
People are used to maps in GIS; therefore, data mining results have
to be summarized on the top of maps
Patterns not only refer to points, but can also refer to lines, or
polygons or other higher order geometrical objects
Patterns exist at different levels of granularity
Large number of patterns, large dataset sizes
Spatial patterns, e.g. spatial clusters can have arbitrary shapes
Regional knowledge is of particular importance due to lack of global
knowledge in geography (spatial heterogeniety)
Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN))
Why Regional Knowledge Important in Spatial Data Mining?
A special challenge in spatial data mining is that
information is usually not uniformly distributed in spatial
datasets.
It has been pointed out in the literature that “whole map
statistics are seldom useful”, that “most relationships in
spatial data sets are geographically regional, rather than
global”, and that “there is no average place on the Earth’s
surface” [Goodchild03, Openshaw99].
Therefore, it is not surprising that domain experts are
mostly interested in discovering hidden patterns at a
regional scale rather than a global scale.
Michael Frank Goodchild
Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN))
Spatial Autocorrelation: Distance-based measure
K-function Definition (http://dhf.ddc.moph.go.th/abstract/s22.pdf )
Test against randomness for point pattern
• λ is intensity of event
Model departure from randomness in a wide range of scales
Inference
For Poisson complete spatial randomness (CSR): K(h) = πh2
Plot Khat(h) against h, compare to Poisson CSR
• >: cluster
• <: decluster/regularity
EhK 1
)( 
  [number of events within distance h of an arbitrary event]
K-Function based Spatial Autocorrelation
Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN))
Answers: and
find patterns from the following sample dataset?
Associations, Spatial associations, Co-location
Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN))
Illustration of Cross-Correlation
Illustration of Cross K-function for Example Data
Cross-K Function for Example Data
Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN))
Colocation Rules – Spatial Interest Measures
http://www.youtube.com/watch?v=RPyJwYqyBuI
Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN))
Cross-Correlation
Cross K-Function Definition
Cross K-function of some pair of spatial feature types
Example
• Which pairs are frequently co-located
• Statistical significance
EhK jji
1
)( 
  [number of type j event within distance h of a randomly chosen
type i event]
Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN))
Spatial Association Rules
•Spatial Association Rules
• A special reference spatial feature
• Transactions are defined around instance of special spatial feature
• Item-types = spatial predicates
•Example: Table 7.5 (pp. 204)
Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN))
Participation index = min{pr(fi, c)}
Where pr(fi, c) of feature fi in co-location c = {f1, f2, …, fk}:
= fraction of instances of fi with feature {f1, …, fi-1, fi+1, …, fk} nearby
N(L) = neighborhood of location L
Pr.[ A in N(L) | B at location L ]Pr.[ A in T | B in T ]conditional probability metric
Neighborhood (N)Transaction (T)collection
events /Boolean spatial featuresitem-typesitem-types
support
discrete sets
Association rules Co-location rules
participation indexprevalence measure
continuous spaceUnderlying space
Co-location rules vs. traditional association rules
Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN))
Conclusions Spatial Data Mining
Spatial patterns are opposite of random
Common spatial patterns: location prediction, feature interaction, hot spots,
geographically referenced statistical patterns, co-location, emergent patterns,…
SDM = search for unexpected interesting patterns in large spatial databases
Spatial patterns may be discovered using
Techniques like classification, associations, clustering and outlier detection
New techniques are needed for SDM due to
• Spatial Auto-correlation
• Importance of non-point data types (e.g. polygons)
• Continuity of space
• Regional knowledge; also establishes a need for scoping
• Separation between spatial and non-spatial subspace—in traditional
approaches clusters are usually defined over the complete attribute space
Knowledge sources are available now
Raw knowledge to perform spatial data mining is mostly available online now
(e.g. relational databases, Google Earth)
GIS tools are available that facilitate integrating knowledge from different
source
Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN))
Spatial Regression
Spatial Regression.pptx
Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN))
Example Videos Discussing Spatial Analysis
http://www.esri.com/what-is-gis/index.html What is GIS?
http://www.youtube.com/watch?v=ZqMul3OIQNI&feature=related (Geo-
graphically weighted regression software advertisement video)
http://www.youtube.com/watch?v=RhDdtqgIy9Q&feature=related
(Spatial Analysis and Remote Sensing Degree at UA)
http://www.youtube.com/watch?v=_SBLBkP9O9I&feature=related
ArcGIS Spatial Analyst Overview
http://www.youtube.com/watch?v=mBSXBqEP-7Y&feature=related
(ArcGIS 9.3: Advanced planning and analysis - Part 1)
http://www.youtube.com/watch?v=agzjyi0rnOo&feature=related
(Example using Spatial Analysis to Analyze Medical Data; the video is not
really that “great”; if you know a better one share it with us!
http://acmgis2011.cs.umn.edu/ ACM GIS Conference, discusses advances
in Geographical Information Systems and related areas
http://www.houstonareagisday.org/ Houston Area GIS Day Nov. 10, 2011

More Related Content

What's hot

Spatial databases
Spatial databasesSpatial databases
Spatial databases
Neha Kulkarni
 
Spatial interpolation techniques
Spatial interpolation techniquesSpatial interpolation techniques
Spatial interpolation techniques
Manisha Shrivastava
 
Georeferencing
GeoreferencingGeoreferencing
Georeferencing
Reham Maher El-Safarini
 
GIS data structure
GIS data structureGIS data structure
GIS data structure
Thana Chirapiwat
 
Data Models - GIS I
Data Models - GIS IData Models - GIS I
Data Models - GIS IJohn Reiser
 
Geographic Phenomena and their Representations
Geographic Phenomena and their RepresentationsGeographic Phenomena and their Representations
Geographic Phenomena and their Representations
NAXA-Developers
 
SPATIAL DATABASES.pptx
SPATIAL DATABASES.pptxSPATIAL DATABASES.pptx
SPATIAL DATABASES.pptx
AmanSingla57
 
A Journey to the World of GIS
A Journey to the World of GISA Journey to the World of GIS
A Journey to the World of GIS
Nishant Sinha
 
Spatial Data Model
Spatial Data ModelSpatial Data Model
Spatial Data Model
Kaium Chowdhury
 
Mining Data Streams
Mining Data StreamsMining Data Streams
Mining Data Streams
SujaAldrin
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
Prashant Gupta
 
Spatial data analysis
Spatial data analysisSpatial data analysis
Spatial data analysis
Johan Blomme
 
Fundamentals of GIS
Fundamentals of GISFundamentals of GIS
Fundamentals of GIS
RajalakshmiS34
 
My ppt on gis
My ppt on gisMy ppt on gis
My ppt on gis
gargsonakshi1
 
QUERY AND NETWORK ANALYSIS IN GIS
QUERY AND NETWORK ANALYSIS IN GISQUERY AND NETWORK ANALYSIS IN GIS
QUERY AND NETWORK ANALYSIS IN GIS
DEVANG KAPADIA
 
5.2 mining time series data
5.2 mining time series data5.2 mining time series data
5.2 mining time series data
Krish_ver2
 
Lecture 7: Participatory GIS for Disaster Management
Lecture 7: Participatory GIS for Disaster ManagementLecture 7: Participatory GIS for Disaster Management
Lecture 7: Participatory GIS for Disaster Management
ESD UNU-IAS
 
TYBSC IT PGIS Unit III Chapter II Data Entry and Preparation
TYBSC IT PGIS Unit III Chapter II Data Entry and PreparationTYBSC IT PGIS Unit III Chapter II Data Entry and Preparation
TYBSC IT PGIS Unit III Chapter II Data Entry and Preparation
Arti Parab Academics
 
Geo-spatial Analysis and Modelling
Geo-spatial Analysis and ModellingGeo-spatial Analysis and Modelling
Geo-spatial Analysis and Modelling
Malla Reddy University
 
DATA in GIS and DATA Query
DATA in GIS and DATA QueryDATA in GIS and DATA Query
DATA in GIS and DATA Query
KU Leuven
 

What's hot (20)

Spatial databases
Spatial databasesSpatial databases
Spatial databases
 
Spatial interpolation techniques
Spatial interpolation techniquesSpatial interpolation techniques
Spatial interpolation techniques
 
Georeferencing
GeoreferencingGeoreferencing
Georeferencing
 
GIS data structure
GIS data structureGIS data structure
GIS data structure
 
Data Models - GIS I
Data Models - GIS IData Models - GIS I
Data Models - GIS I
 
Geographic Phenomena and their Representations
Geographic Phenomena and their RepresentationsGeographic Phenomena and their Representations
Geographic Phenomena and their Representations
 
SPATIAL DATABASES.pptx
SPATIAL DATABASES.pptxSPATIAL DATABASES.pptx
SPATIAL DATABASES.pptx
 
A Journey to the World of GIS
A Journey to the World of GISA Journey to the World of GIS
A Journey to the World of GIS
 
Spatial Data Model
Spatial Data ModelSpatial Data Model
Spatial Data Model
 
Mining Data Streams
Mining Data StreamsMining Data Streams
Mining Data Streams
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Spatial data analysis
Spatial data analysisSpatial data analysis
Spatial data analysis
 
Fundamentals of GIS
Fundamentals of GISFundamentals of GIS
Fundamentals of GIS
 
My ppt on gis
My ppt on gisMy ppt on gis
My ppt on gis
 
QUERY AND NETWORK ANALYSIS IN GIS
QUERY AND NETWORK ANALYSIS IN GISQUERY AND NETWORK ANALYSIS IN GIS
QUERY AND NETWORK ANALYSIS IN GIS
 
5.2 mining time series data
5.2 mining time series data5.2 mining time series data
5.2 mining time series data
 
Lecture 7: Participatory GIS for Disaster Management
Lecture 7: Participatory GIS for Disaster ManagementLecture 7: Participatory GIS for Disaster Management
Lecture 7: Participatory GIS for Disaster Management
 
TYBSC IT PGIS Unit III Chapter II Data Entry and Preparation
TYBSC IT PGIS Unit III Chapter II Data Entry and PreparationTYBSC IT PGIS Unit III Chapter II Data Entry and Preparation
TYBSC IT PGIS Unit III Chapter II Data Entry and Preparation
 
Geo-spatial Analysis and Modelling
Geo-spatial Analysis and ModellingGeo-spatial Analysis and Modelling
Geo-spatial Analysis and Modelling
 
DATA in GIS and DATA Query
DATA in GIS and DATA QueryDATA in GIS and DATA Query
DATA in GIS and DATA Query
 

Viewers also liked

Spatial Data Mining : Seminar
Spatial Data Mining : SeminarSpatial Data Mining : Seminar
Spatial Data Mining : SeminarIpsit Dash
 
The Do's and Don'ts of Data Mining
The Do's and Don'ts of Data MiningThe Do's and Don'ts of Data Mining
The Do's and Don'ts of Data Mining
Salford Systems
 
Data mining slides
Data mining slidesData mining slides
Data mining slidessmj
 
Data Mining Concepts
Data Mining ConceptsData Mining Concepts
Data Mining Concepts
Dung Nguyen
 
Data mining
Data miningData mining
Data mining
Akannsha Totewar
 
Crop production ppt
Crop production pptCrop production ppt
Crop production ppt
vaggyaggy
 

Viewers also liked (7)

Spatial Data Mining : Seminar
Spatial Data Mining : SeminarSpatial Data Mining : Seminar
Spatial Data Mining : Seminar
 
The Do's and Don'ts of Data Mining
The Do's and Don'ts of Data MiningThe Do's and Don'ts of Data Mining
The Do's and Don'ts of Data Mining
 
Temporal data mining
Temporal data miningTemporal data mining
Temporal data mining
 
Data mining slides
Data mining slidesData mining slides
Data mining slides
 
Data Mining Concepts
Data Mining ConceptsData Mining Concepts
Data Mining Concepts
 
Data mining
Data miningData mining
Data mining
 
Crop production ppt
Crop production pptCrop production ppt
Crop production ppt
 

Similar to Introduction to spatial data mining

dm_spdm_short.ppt
dm_spdm_short.pptdm_spdm_short.ppt
dm_spdm_short.ppt
SahilShahPhD2020
 
dm_spdm_short (3).ppt
dm_spdm_short (3).pptdm_spdm_short (3).ppt
dm_spdm_short (3).ppt
yakot2alordea2
 
Towards a graph of ancient world geographical knowledge
Towards a graph of ancient world geographical knowledgeTowards a graph of ancient world geographical knowledge
Towards a graph of ancient world geographical knowledge
Elton Barker
 
AAG_2011
AAG_2011AAG_2011
AAG_2011
ohuisman
 
5.1 major analytical techniques
5.1 major analytical techniques5.1 major analytical techniques
5.1 major analytical techniques
md Siraj
 
Topological Data Analysis of Complex Spatial Systems
Topological Data Analysis of Complex Spatial SystemsTopological Data Analysis of Complex Spatial Systems
Topological Data Analysis of Complex Spatial Systems
Mason Porter
 
Adamas university2018 f
Adamas university2018 fAdamas university2018 f
Adamas university2018 f
Prof Ashis Sarkar
 
chapter 2 the nature.ppt
chapter 2 the nature.pptchapter 2 the nature.ppt
chapter 2 the nature.ppt
Mohammed Waseeque sheraz
 
Chapter 1 basic concepts
Chapter 1 basic conceptsChapter 1 basic concepts
Chapter 1 basic conceptsbryankrouse
 
Steps Towards a History of Ethnomethodology in HCI
Steps Towards a History of Ethnomethodology in HCI Steps Towards a History of Ethnomethodology in HCI
Steps Towards a History of Ethnomethodology in HCI butest
 
Kuchler Maps Project Report (2)
Kuchler Maps Project Report (2)Kuchler Maps Project Report (2)
Kuchler Maps Project Report (2)Courtney Misich
 
How to use science maps to navigate large information spaces? What is the lin...
How to use science maps to navigate large information spaces? What is the lin...How to use science maps to navigate large information spaces? What is the lin...
How to use science maps to navigate large information spaces? What is the lin...
Andrea Scharnhorst
 
Geographic Information Management Transformation
Geographic Information Management TransformationGeographic Information Management Transformation
Geographic Information Management Transformation
Pat Kenny
 
Coates: topological approximations for spatial representation
Coates: topological approximations for spatial representationCoates: topological approximations for spatial representation
Coates: topological approximations for spatial representationArchiLab 7
 
Why do we need to model the science system?
Why do we need to model the science system?Why do we need to model the science system?
Why do we need to model the science system?
Andrea Scharnhorst
 
Research Issues and Concerns
Research Issues and ConcernsResearch Issues and Concerns
Research Issues and Concerns
Prof Ashis Sarkar
 
05 astrostat feigelson
05 astrostat feigelson05 astrostat feigelson
05 astrostat feigelson
Marco Quartulli
 
Knowledge – dynamics – landscape - navigation – what have interfaces to digit...
Knowledge – dynamics – landscape - navigation – what have interfaces to digit...Knowledge – dynamics – landscape - navigation – what have interfaces to digit...
Knowledge – dynamics – landscape - navigation – what have interfaces to digit...
Andrea Scharnhorst
 
Sampling and Probability in Geography
Sampling and Probability in Geography Sampling and Probability in Geography
Sampling and Probability in Geography
Prof Ashis Sarkar
 

Similar to Introduction to spatial data mining (20)

dm_spdm_short.ppt
dm_spdm_short.pptdm_spdm_short.ppt
dm_spdm_short.ppt
 
dm_spdm_short (3).ppt
dm_spdm_short (3).pptdm_spdm_short (3).ppt
dm_spdm_short (3).ppt
 
Towards a graph of ancient world geographical knowledge
Towards a graph of ancient world geographical knowledgeTowards a graph of ancient world geographical knowledge
Towards a graph of ancient world geographical knowledge
 
AAG_2011
AAG_2011AAG_2011
AAG_2011
 
5.1 major analytical techniques
5.1 major analytical techniques5.1 major analytical techniques
5.1 major analytical techniques
 
Topological Data Analysis of Complex Spatial Systems
Topological Data Analysis of Complex Spatial SystemsTopological Data Analysis of Complex Spatial Systems
Topological Data Analysis of Complex Spatial Systems
 
Adamas university2018 f
Adamas university2018 fAdamas university2018 f
Adamas university2018 f
 
chapter 2 the nature.ppt
chapter 2 the nature.pptchapter 2 the nature.ppt
chapter 2 the nature.ppt
 
Chapter 1 basic concepts
Chapter 1 basic conceptsChapter 1 basic concepts
Chapter 1 basic concepts
 
Steps Towards a History of Ethnomethodology in HCI
Steps Towards a History of Ethnomethodology in HCI Steps Towards a History of Ethnomethodology in HCI
Steps Towards a History of Ethnomethodology in HCI
 
Kuchler Maps Project Report (2)
Kuchler Maps Project Report (2)Kuchler Maps Project Report (2)
Kuchler Maps Project Report (2)
 
How to use science maps to navigate large information spaces? What is the lin...
How to use science maps to navigate large information spaces? What is the lin...How to use science maps to navigate large information spaces? What is the lin...
How to use science maps to navigate large information spaces? What is the lin...
 
Geographic Information Management Transformation
Geographic Information Management TransformationGeographic Information Management Transformation
Geographic Information Management Transformation
 
Coates: topological approximations for spatial representation
Coates: topological approximations for spatial representationCoates: topological approximations for spatial representation
Coates: topological approximations for spatial representation
 
Why do we need to model the science system?
Why do we need to model the science system?Why do we need to model the science system?
Why do we need to model the science system?
 
10.1.1.17.1245
10.1.1.17.124510.1.1.17.1245
10.1.1.17.1245
 
Research Issues and Concerns
Research Issues and ConcernsResearch Issues and Concerns
Research Issues and Concerns
 
05 astrostat feigelson
05 astrostat feigelson05 astrostat feigelson
05 astrostat feigelson
 
Knowledge – dynamics – landscape - navigation – what have interfaces to digit...
Knowledge – dynamics – landscape - navigation – what have interfaces to digit...Knowledge – dynamics – landscape - navigation – what have interfaces to digit...
Knowledge – dynamics – landscape - navigation – what have interfaces to digit...
 
Sampling and Probability in Geography
Sampling and Probability in Geography Sampling and Probability in Geography
Sampling and Probability in Geography
 

More from Hoang Nguyen

Rest api to integrate with your site
Rest api to integrate with your siteRest api to integrate with your site
Rest api to integrate with your site
Hoang Nguyen
 
How to build a rest api
How to build a rest apiHow to build a rest api
How to build a rest api
Hoang Nguyen
 
Api crash
Api crashApi crash
Api crash
Hoang Nguyen
 
Smm and caching
Smm and cachingSmm and caching
Smm and caching
Hoang Nguyen
 
Optimizing shared caches in chip multiprocessors
Optimizing shared caches in chip multiprocessorsOptimizing shared caches in chip multiprocessors
Optimizing shared caches in chip multiprocessors
Hoang Nguyen
 
How analysis services caching works
How analysis services caching worksHow analysis services caching works
How analysis services caching works
Hoang Nguyen
 
Hardware managed cache
Hardware managed cacheHardware managed cache
Hardware managed cache
Hoang Nguyen
 
Directory based cache coherence
Directory based cache coherenceDirectory based cache coherence
Directory based cache coherence
Hoang Nguyen
 
Cache recap
Cache recapCache recap
Cache recap
Hoang Nguyen
 
Python your new best friend
Python your new best friendPython your new best friend
Python your new best friend
Hoang Nguyen
 
Python language data types
Python language data typesPython language data types
Python language data types
Hoang Nguyen
 
Python basics
Python basicsPython basics
Python basics
Hoang Nguyen
 
Programming for engineers in python
Programming for engineers in pythonProgramming for engineers in python
Programming for engineers in python
Hoang Nguyen
 
Learning python
Learning pythonLearning python
Learning python
Hoang Nguyen
 
Extending burp with python
Extending burp with pythonExtending burp with python
Extending burp with python
Hoang Nguyen
 
Cobol, lisp, and python
Cobol, lisp, and pythonCobol, lisp, and python
Cobol, lisp, and python
Hoang Nguyen
 
Object oriented programming using c++
Object oriented programming using c++Object oriented programming using c++
Object oriented programming using c++
Hoang Nguyen
 
Object oriented analysis
Object oriented analysisObject oriented analysis
Object oriented analysis
Hoang Nguyen
 
Object model
Object modelObject model
Object model
Hoang Nguyen
 
Data structures and algorithms
Data structures and algorithmsData structures and algorithms
Data structures and algorithms
Hoang Nguyen
 

More from Hoang Nguyen (20)

Rest api to integrate with your site
Rest api to integrate with your siteRest api to integrate with your site
Rest api to integrate with your site
 
How to build a rest api
How to build a rest apiHow to build a rest api
How to build a rest api
 
Api crash
Api crashApi crash
Api crash
 
Smm and caching
Smm and cachingSmm and caching
Smm and caching
 
Optimizing shared caches in chip multiprocessors
Optimizing shared caches in chip multiprocessorsOptimizing shared caches in chip multiprocessors
Optimizing shared caches in chip multiprocessors
 
How analysis services caching works
How analysis services caching worksHow analysis services caching works
How analysis services caching works
 
Hardware managed cache
Hardware managed cacheHardware managed cache
Hardware managed cache
 
Directory based cache coherence
Directory based cache coherenceDirectory based cache coherence
Directory based cache coherence
 
Cache recap
Cache recapCache recap
Cache recap
 
Python your new best friend
Python your new best friendPython your new best friend
Python your new best friend
 
Python language data types
Python language data typesPython language data types
Python language data types
 
Python basics
Python basicsPython basics
Python basics
 
Programming for engineers in python
Programming for engineers in pythonProgramming for engineers in python
Programming for engineers in python
 
Learning python
Learning pythonLearning python
Learning python
 
Extending burp with python
Extending burp with pythonExtending burp with python
Extending burp with python
 
Cobol, lisp, and python
Cobol, lisp, and pythonCobol, lisp, and python
Cobol, lisp, and python
 
Object oriented programming using c++
Object oriented programming using c++Object oriented programming using c++
Object oriented programming using c++
 
Object oriented analysis
Object oriented analysisObject oriented analysis
Object oriented analysis
 
Object model
Object modelObject model
Object model
 
Data structures and algorithms
Data structures and algorithmsData structures and algorithms
Data structures and algorithms
 

Recently uploaded

"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
Fwdays
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
CatarinaPereira64715
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
Bhaskar Mitra
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 

Recently uploaded (20)

"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 

Introduction to spatial data mining

  • 1. Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN)) Organization Spatial Data Mining Fall 2011 1. Introduction 2. Region Discovery—Finding Interesting Places in Spatial Datasets 3. Project3 4. CLEVER: a Spatial Clustering Algorithm Supporting Plug-in Fitness Functions 5. [Spatial Regression]
  • 2. Brief Introduction to Spatial Data Mining Spatial data mining is the process of discovering interesting, useful, non-trivial patterns from large spatial datasets Reading Material: http://en.wikipedia.org/wiki/Spatial_analysis Spatial Statistics Software: http://www.spatial-statistics.com/
  • 3. Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN)) Examples of Spatial Patterns Historic Examples (section 7.1.5, pp. 186) 1855 Asiatic Cholera in London: A water pump identified as the source Fluoride and healthy gums near Colorado river Theory of Gondwanaland - continents fit like pieces of a jigsaw puzlle Modern Examples Cancer clusters to investigate environment health hazards Crime hotspots for planning police patrol routes Bald eagles nest on tall trees near open water Nile virus spreading from north east USA to south and west Unusual warming of Pacific ocean (El Nino) affects weather in USA
  • 4. Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN)) Why Learn about Spatial Data Mining? Two basic reasons for new work Consideration of use in certain application domains Provide fundamental new understanding Application domains Scale up secondary spatial (statistical) analysis to very large datasets • Describe/explain locations of human settlements in last 5000 years • Find cancer clusters to locate hazardous environments • Prepare land-use maps from satellite imagery • Predict habitat suitable for endangered species Find new spatial patterns • Find groups of co-located geographic features Exercise. Name 2 application domains not listed above.
  • 5. Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN)) Why Learn about Spatial Data Mining? - 2 New understanding of geographic processes for Critical questions Ex. How is the health of planet Earth? Ex. Characterize effects of human activity on environment and ecology Ex. Predict effect of El Nino on weather, and economy Traditional approach: manually generate and test hypothesis But, spatial data is growing too fast to analyze manually • Satellite imagery, GPS tracks, sensors on highways, … Number of possible geographic hypothesis too large to explore manually • Large number of geographic features and locations • Number of interacting subsets of features grow exponentially • Ex. Find tele connections between weather events across ocean and land areas SDM may reduce the set of plausible hypothesis Identify hypothesis supported by the data For further exploration using traditional statistical methods
  • 6. Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN)) Autocorrelation Items in a traditional data are independent of each other, whereas properties of locations in a map are often “auto-correlated”. First law of geography [Tobler]: Everything is related to everything, but nearby things are more related than distant things. People with similar backgrounds tend to live in the same area Economies of nearby regions tend to be similar Changes in temperature occur gradually over space(and time) Waldo Tobler in 2000 Papers on “Laws in Geography”: http://www.geog.ucsb.edu/~good/papers/393.pdf http://www.cs.uh.edu/~ceick/DM/GOO10.pdf
  • 7. Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN)) Characteristics of Spatial Data Mining Auto correlation Patterns usually have to be defined in the spatial attribute subspace and not in the complete attribute space Longitude and latitude (or other coordinate systems) are the glue that link different data collections together People are used to maps in GIS; therefore, data mining results have to be summarized on the top of maps Patterns not only refer to points, but can also refer to lines, or polygons or other higher order geometrical objects Patterns exist at different levels of granularity Large number of patterns, large dataset sizes Spatial patterns, e.g. spatial clusters can have arbitrary shapes Regional knowledge is of particular importance due to lack of global knowledge in geography (spatial heterogeniety)
  • 8. Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN)) Why Regional Knowledge Important in Spatial Data Mining? A special challenge in spatial data mining is that information is usually not uniformly distributed in spatial datasets. It has been pointed out in the literature that “whole map statistics are seldom useful”, that “most relationships in spatial data sets are geographically regional, rather than global”, and that “there is no average place on the Earth’s surface” [Goodchild03, Openshaw99]. Therefore, it is not surprising that domain experts are mostly interested in discovering hidden patterns at a regional scale rather than a global scale. Michael Frank Goodchild
  • 9. Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN)) Spatial Autocorrelation: Distance-based measure K-function Definition (http://dhf.ddc.moph.go.th/abstract/s22.pdf ) Test against randomness for point pattern • λ is intensity of event Model departure from randomness in a wide range of scales Inference For Poisson complete spatial randomness (CSR): K(h) = πh2 Plot Khat(h) against h, compare to Poisson CSR • >: cluster • <: decluster/regularity EhK 1 )(    [number of events within distance h of an arbitrary event] K-Function based Spatial Autocorrelation
  • 10. Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN)) Answers: and find patterns from the following sample dataset? Associations, Spatial associations, Co-location
  • 11. Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN)) Illustration of Cross-Correlation Illustration of Cross K-function for Example Data Cross-K Function for Example Data
  • 12. Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN)) Colocation Rules – Spatial Interest Measures http://www.youtube.com/watch?v=RPyJwYqyBuI
  • 13. Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN)) Cross-Correlation Cross K-Function Definition Cross K-function of some pair of spatial feature types Example • Which pairs are frequently co-located • Statistical significance EhK jji 1 )(    [number of type j event within distance h of a randomly chosen type i event]
  • 14. Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN)) Spatial Association Rules •Spatial Association Rules • A special reference spatial feature • Transactions are defined around instance of special spatial feature • Item-types = spatial predicates •Example: Table 7.5 (pp. 204)
  • 15. Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN)) Participation index = min{pr(fi, c)} Where pr(fi, c) of feature fi in co-location c = {f1, f2, …, fk}: = fraction of instances of fi with feature {f1, …, fi-1, fi+1, …, fk} nearby N(L) = neighborhood of location L Pr.[ A in N(L) | B at location L ]Pr.[ A in T | B in T ]conditional probability metric Neighborhood (N)Transaction (T)collection events /Boolean spatial featuresitem-typesitem-types support discrete sets Association rules Co-location rules participation indexprevalence measure continuous spaceUnderlying space Co-location rules vs. traditional association rules
  • 16. Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN)) Conclusions Spatial Data Mining Spatial patterns are opposite of random Common spatial patterns: location prediction, feature interaction, hot spots, geographically referenced statistical patterns, co-location, emergent patterns,… SDM = search for unexpected interesting patterns in large spatial databases Spatial patterns may be discovered using Techniques like classification, associations, clustering and outlier detection New techniques are needed for SDM due to • Spatial Auto-correlation • Importance of non-point data types (e.g. polygons) • Continuity of space • Regional knowledge; also establishes a need for scoping • Separation between spatial and non-spatial subspace—in traditional approaches clusters are usually defined over the complete attribute space Knowledge sources are available now Raw knowledge to perform spatial data mining is mostly available online now (e.g. relational databases, Google Earth) GIS tools are available that facilitate integrating knowledge from different source
  • 17. Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN)) Spatial Regression Spatial Regression.pptx
  • 18. Ch. Eick: Spatial Data Mining (inspired by a talk given at UH by Shashi Shekhar (UMN)) Example Videos Discussing Spatial Analysis http://www.esri.com/what-is-gis/index.html What is GIS? http://www.youtube.com/watch?v=ZqMul3OIQNI&feature=related (Geo- graphically weighted regression software advertisement video) http://www.youtube.com/watch?v=RhDdtqgIy9Q&feature=related (Spatial Analysis and Remote Sensing Degree at UA) http://www.youtube.com/watch?v=_SBLBkP9O9I&feature=related ArcGIS Spatial Analyst Overview http://www.youtube.com/watch?v=mBSXBqEP-7Y&feature=related (ArcGIS 9.3: Advanced planning and analysis - Part 1) http://www.youtube.com/watch?v=agzjyi0rnOo&feature=related (Example using Spatial Analysis to Analyze Medical Data; the video is not really that “great”; if you know a better one share it with us! http://acmgis2011.cs.umn.edu/ ACM GIS Conference, discusses advances in Geographical Information Systems and related areas http://www.houstonareagisday.org/ Houston Area GIS Day Nov. 10, 2011