An Efficient Two-Step Method for
Classification of Spatial Data
Authors : Krzysztof Koperski, Jiawei Han, Nebojsa Stefanovic
Presented at: International Symposium on Spatial Data Handling (SDH’98)
Reviewed by: Abhishek Agrawal
Introduction
• Spatial databases hold very large amounts of spatial data, collected for applications ranging
from remote sensing to geographical information systems (GIS), computer cartography, and
environmental assessment and planning.
• These databases contain many interesting implicit spatial relations and patterns that are not
explicitly stored and therefore have to be extracted.
• One spatial data mining technique is classification of the spatial objects stored in the
database: the objective is to label the objects by identifying a set of rules that describes
the partition into classes.
Classification Approach : Spatial Decision Tree
❖ In this paper [1], the authors use a decision tree to classify spatial objects based on
➢ Non-spatial properties of the classified objects (traditional)
➢ Spatial relations of the classified objects to other objects in the database
❖ The authors also analyze the classification of spatial objects in relation to thematic maps
and to spatial relationships with other objects in the database.
❖ For the new approach, the authors report experimental results on both real and synthetic data
and compare its performance and result quality with existing methods for the same problem.
Problem Definition
Business Problem: Label local business units such as shopping malls
or stores by their profit status, based on the influence of
their trade area.
Data Mining Problem: Classify spatial objects such as shopping malls or
stores, each described by its attributes, into one of two classes, Y and N.
The class is given by the attribute high_profit, which takes the values
Y for “yes” and N for “no”.
● In our example, objects OID1 and OID2
belong to class Y and objects OID3,
OID4 and OID5 belong to class N.
● We want to build a decision tree
classifying objects Oi based on two
types of information:
➢ descriptions of the objects in the
proximity of objects Oi
➢ non-spatial attributes of the
thematic map
State of the Art
● Fayyad et al. [2] used decision tree methods to classify images of stellar objects in order to
detect stars and galaxies. They used the low-level image processing system FOCAS to select and
generate basic attributes. The method works on image databases and is tailored to astronomical
attributes, so it is not suitable for vector data (GIS databases).
● Ester et al. [3] proposed an approach based on the ID3 algorithm that uses the concept of
neighbourhood graphs. It does not analyze aggregate values of non-spatial attributes for the
neighbouring objects, and it does not perform relevance analysis to narrow its search space.
● Ng and Yu [4] described a method for extracting strong, common and discriminating
characteristics of clusters based on thematic maps. They did not extend the resulting
characteristics to the construction of decision trees.
Classification Algorithm
Build a decision tree that classifies spatial objects based on spatial predicates, functions and
thematic maps.
Input :
1. Spatial Database containing:
a. classified objects Oc
b. other spatial objects with non-spatial attributes
2. Geo-mining query specifying:
a. objects to be used, predictive attributes, predicates and functions
b. attribute, predicate or function used as a class label
Output :
Binary Decision Tree
Method: Spatial Decision Tree
1. Collect a set S of classified objects and other objects that are used for description
2. For a sample of spatial objects Oc from S:
a. Build sets of predicates describing all objects using coarse predicates, functions and
attributes.
b. Generalize the sets of predicates based on concept hierarchies
c. Find the relevant coarse predicates, functions and attributes using the RELIEF algorithm
3. Find the best buffer size for aggregates of thematic map polygons: for every relevant
non-spatial aggregate attribute, find the buffer size Xmax at which the information gain of the
aggregated attribute is maximal.
4. Build sets of predicates using the relevant fine predicates and generalize them based on
concept hierarchies.
5. Generate the decision tree
Step 1.a : Define the MBR (Minimum Bounding Rectangle)
using the data distribution and a confidence
level as threshold.
Step 2.a : Find a coarse description for the sample,
listing the spatial attributes, functions, etc.
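A coarse description can rely on cheap, approximate geometry. The following is a minimal sketch, assuming that a coarse proximity predicate is evaluated on minimum bounding rectangles rather than on exact shapes. Because each MBR contains its object, the MBR distance never exceeds the exact distance, so a pair that fails the coarse test can be discarded without any exact computation. The function names, coordinates and threshold are hypothetical and are not taken from the paper.

```python
from math import hypot

def mbr(points):
    """Minimum bounding rectangle (xmin, ymin, xmax, ymax) of a list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))

def mbr_distance(a, b):
    """Smallest distance between two axis-aligned rectangles (0 if they overlap)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return hypot(dx, dy)

def coarse_close_to(a_points, b_points, threshold):
    """Coarse predicate: True when the two MBRs lie within `threshold` of each other."""
    return mbr_distance(mbr(a_points), mbr(b_points)) <= threshold

# Hypothetical objects given by their vertices.
mall = [(0, 0), (2, 0), (2, 1), (0, 1)]
park = [(5, 0), (7, 0), (7, 3), (5, 3)]
highway = [(0, 10), (20, 10)]

print(coarse_close_to(mall, park, 4))     # candidate pair, kept for the fine step
print(coarse_close_to(mall, highway, 4))  # ruled out by the coarse test alone
```

Only the pairs that pass the coarse test would need the more expensive fine predicates in Step 4.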
Step 2.b : Generalize the predicates using concept
hierarchies
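Generalization along a concept hierarchy can be sketched as follows. This is an illustration only: the hierarchy, the predicate names and the (predicate, object type) pair representation are assumptions made for the example, not the paper's data model.

```python
# Hypothetical concept hierarchy: child concept -> parent concept.
HIERARCHY = {
    "elementary_school": "school",
    "high_school": "school",
    "school": "public_facility",
    "city_park": "park",
    "state_park": "park",
    "park": "recreation_area",
}

def generalize(concept, levels=1):
    """Climb `levels` steps up the hierarchy, stopping at the top."""
    for _ in range(levels):
        if concept not in HIERARCHY:
            break
        concept = HIERARCHY[concept]
    return concept

def generalize_predicates(predicates, levels=1):
    """Generalize the object type of every (predicate, type) pair; duplicates merge."""
    return {(name, generalize(obj, levels)) for name, obj in predicates}

# Description of one classified object before generalization.
description = {
    ("close_to", "elementary_school"),
    ("close_to", "high_school"),
    ("close_to", "city_park"),
}
generalized = generalize_predicates(description)
print(sorted(generalized))
```

The two school predicates collapse into a single close_to school predicate, which shrinks the set of predicates the later steps have to consider.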
RELIEF ALGORITHM
Find Relevant Attributes
Step 2.c : For every object s in the sample, two nearest neighbours are found:
one that belongs to the same class (Y/N) as s (nearest hit)
and one that belongs to a different class than s (nearest miss)
Step 2.c : Adjust the weight of each predicate based on the neighbours' predicate values:
➔ Nearest hit has the same predicate value: the weight of this predicate increases ↑
➔ Nearest hit has a different predicate value: the weight of this predicate decreases ↓
➔ Nearest miss has the same predicate value: the weight of this predicate decreases ↓
➔ Nearest miss has a different predicate value: the weight of this predicate increases ↑
Predicates whose weight exceeds a threshold are selected as relevant.
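The four update rules above can be sketched in a few lines. This is a sketch under stated assumptions, not the authors' implementation: objects are described by boolean predicates, neighbours are found by Hamming distance over those predicates, every object in the sample is visited once, and the scores are divided by 2n so that the weights fall in [-1, 1]. The predicate names, the toy data and the threshold value are hypothetical.

```python
def hamming(a, b, predicates):
    """Number of predicates on which two object descriptions disagree."""
    return sum(a[p] != b[p] for p in predicates)

def relief_weights(objects, labels, predicates):
    """Apply the four update rules once per object; weights end up in [-1, 1]."""
    scores = {p: 0 for p in predicates}
    n = len(objects)
    for i, s in enumerate(objects):
        hits = [j for j in range(n) if j != i and labels[j] == labels[i]]
        misses = [j for j in range(n) if labels[j] != labels[i]]
        if not hits or not misses:
            continue  # no neighbour of one kind: skip this object
        hit = objects[min(hits, key=lambda j: hamming(s, objects[j], predicates))]
        miss = objects[min(misses, key=lambda j: hamming(s, objects[j], predicates))]
        for p in predicates:
            scores[p] += 1 if hit[p] == s[p] else -1   # hit: same up, different down
            scores[p] += 1 if miss[p] != s[p] else -1  # miss: different up, same down
    return {p: scores[p] / (2 * n) for p in predicates}

predicates = ["close_to_highway", "close_to_park"]
objects = [
    {"close_to_highway": True,  "close_to_park": True},   # OID1
    {"close_to_highway": True,  "close_to_park": False},  # OID2
    {"close_to_highway": False, "close_to_park": True},   # OID3
    {"close_to_highway": False, "close_to_park": False},  # OID4
    {"close_to_highway": False, "close_to_park": True},   # OID5
]
labels = ["Y", "Y", "N", "N", "N"]

weights = relief_weights(objects, labels, predicates)
threshold = 0.1  # hypothetical relevance threshold
relevant = [p for p in predicates if weights[p] > threshold]
print(weights, relevant)
```

On this toy data close_to_highway separates the classes perfectly and receives weight 1.0, while close_to_park receives -0.6 and is dropped.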
Step 3: Find the best size for the buffer for aggregates of thematic map
polygons.
• Different criteria may be used for the shape of the
buffer. Buffers may be based on rings or on
customer penetration polygons.
• Rings have some advantages:
1. ease of use
2. no need to determine the trade area from
customer data
3. easy comparison between sites
• Buffers represent the areas that have an impact
on the class label attribute of the classified objects.
• The buffer size is fixed by finding, for all
relevant non-spatial aggregate attributes,
the size Xmax at which the information gain
of the aggregated attribute is maximal.
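The choice of Xmax can be sketched as a search over candidate buffer sizes. The example below is a minimal illustration for a single aggregate attribute, assuming the aggregate (here a hypothetical average income inside a ring buffer) has already been computed for each classified object at each candidate radius, and that information gain is measured for the best binary threshold split. All values are made up.

```python
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    result = 0.0
    for c in set(labels):
        p = labels.count(c) / n
        result -= p * log2(p)
    return result

def best_binary_gain(values, labels):
    """Highest information gain over all splits of the form value <= t."""
    base = entropy(labels)
    best = 0.0
    for t in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        best = max(best, base - remainder)
    return best

def best_buffer_size(aggregates_by_radius, labels):
    """Radius whose aggregated attribute yields the maximum information gain."""
    return max(aggregates_by_radius,
               key=lambda r: best_binary_gain(aggregates_by_radius[r], labels))

labels = ["Y", "Y", "N", "N", "N"]  # OID1..OID5
# Hypothetical average income (in thousands) within each buffer radius (in km).
aggregates_by_radius = {
    1: [50, 40, 45, 55, 42],
    2: [70, 68, 40, 45, 42],
    5: [55, 50, 52, 48, 51],
}
x_max = best_buffer_size(aggregates_by_radius, labels)
print(x_max)
```

Here the 2 km buffer separates the two classes completely, so it is selected. With several aggregate attributes the same search would be repeated for each attribute.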
Step 4 : Build sets of predicates using relevant fine predicates and
generalize based on concept hierarchies.
Step 5 : Build Decision Tree : Binary Split ( Based on Info gain )
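A binary split on information gain can be sketched as below. This is a generic illustration, not the paper's exact procedure: each object is represented by the set of predicates that hold for it, every internal node tests one predicate (present or absent), and the predicate with the highest information gain is chosen at each node. The predicate names and data are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values()) if n else 0.0

def info_gain(rows, labels, predicate):
    """Gain of splitting on whether `predicate` holds for each object."""
    yes = [l for r, l in zip(rows, labels) if predicate in r]
    no = [l for r, l in zip(rows, labels) if predicate not in r]
    remainder = (len(yes) * entropy(yes) + len(no) * entropy(no)) / len(labels)
    return entropy(labels) - remainder

def majority(labels):
    """Most frequent class label."""
    return Counter(labels).most_common(1)[0][0]

def build_tree(rows, labels, predicates):
    """Recursively build a binary tree; leaves are class labels."""
    if len(set(labels)) == 1 or not predicates:
        return majority(labels)
    best = max(predicates, key=lambda p: info_gain(rows, labels, p))
    yes = [i for i, r in enumerate(rows) if best in r]
    no = [i for i, r in enumerate(rows) if best not in r]
    if not yes or not no:
        return majority(labels)  # the best predicate does not split the data
    rest = [p for p in predicates if p != best]
    return {
        "test": best,
        "yes": build_tree([rows[i] for i in yes], [labels[i] for i in yes], rest),
        "no": build_tree([rows[i] for i in no], [labels[i] for i in no], rest),
    }

rows = [
    {"close_to(highway)", "avg_income_high"},  # OID1
    {"close_to(highway)"},                     # OID2
    {"avg_income_high"},                       # OID3
    set(),                                     # OID4
    {"close_to(park)"},                        # OID5
]
labels = ["Y", "Y", "N", "N", "N"]
predicates = ["close_to(highway)", "avg_income_high", "close_to(park)"]

tree = build_tree(rows, labels, predicates)
print(tree)
```

For this data a single test on close_to(highway) is enough, so the tree has one internal node and two leaves.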
Complexity Analysis
Results & Performance Evaluation
• Experiments were performed on synthetic data merged with TIGER U.S. census data for
Washington State.
• With real data, the best results were obtained with a threshold between 0 and 0.2, and accuracy
increased drastically when relevance analysis was used.
Conclusion and Future Directions
• Classification of geographical objects enables researchers to explore
interesting relations between spatial and non-spatial data.
• The algorithm uses less costly, approximate spatial computations and
relevance analysis to produce smaller and more accurate decision trees.
• Pre-computed spatial indexes can be stored and reused by regular spatial
queries to find neighbourhood attributes.
• The authors plan to perform experiments using aggregate values for thematic
maps and varying distances for close_to spatial predicates.
• They also plan to integrate the method with their spatial data mining prototype, GeoMiner.
References
[1] Koperski, Krzysztof, Jiawei Han, and Nebojsa Stefanovic. "An efficient two-step method for
classification of spatial data." Proceedings of the International Symposium on Spatial Data
Handling (SDH’98). 1998.
[2] Fayyad, Usama M., S. George Djorgovski, and Nicholas Weir. "Automating the analysis and
cataloging of sky surveys." Advances in knowledge discovery and data mining. American
Association for Artificial Intelligence, 1996.
[3] Ester, Martin, Hans-Peter Kriegel, and Jörg Sander. "Spatial data mining: A database approach."
Advances in spatial databases. Springer Berlin Heidelberg, 1997.
[4] Ng, R. T., and Y. Yu. "Discovering Strong, Common and Discriminating Characteristics of
Clusters from Thematic Maps." Proc. of the 11th Annual Symp. on Geographic Information Systems. 1997.

More Related Content

What's hot

My8clst
My8clstMy8clst
My8clst
ketan533
 
What is cluster analysis
What is cluster analysisWhat is cluster analysis
What is cluster analysis
Prabhat gangwar
 
Clusters techniques
Clusters techniquesClusters techniques
Clusters techniques
rajshreemuthiah
 
Data clustering
Data clustering Data clustering
Data clustering
GARIMA SHAKYA
 
Cluster Analysis
Cluster AnalysisCluster Analysis
Cluster Analysis
SSA KPI
 
Cluster Analysis Introduction
Cluster Analysis IntroductionCluster Analysis Introduction
Cluster Analysis Introduction
PrasiddhaSarma
 
Dataa miining
Dataa miiningDataa miining
Dataa miining
SUBBIAH SURESH
 
Machine Learning Project
Machine Learning ProjectMachine Learning Project
Machine Learning ProjectAdeyemi Fowe
 
K means Clustering
K means ClusteringK means Clustering
K means ClusteringEdureka!
 

What's hot (11)

My8clst
My8clstMy8clst
My8clst
 
What is cluster analysis
What is cluster analysisWhat is cluster analysis
What is cluster analysis
 
Clusters techniques
Clusters techniquesClusters techniques
Clusters techniques
 
Data clustering
Data clustering Data clustering
Data clustering
 
Cluster Analysis
Cluster AnalysisCluster Analysis
Cluster Analysis
 
Cluster Analysis Introduction
Cluster Analysis IntroductionCluster Analysis Introduction
Cluster Analysis Introduction
 
Dataa miining
Dataa miiningDataa miining
Dataa miining
 
Lect4
Lect4Lect4
Lect4
 
Machine Learning Project
Machine Learning ProjectMachine Learning Project
Machine Learning Project
 
Cluster Analysis for Dummies
Cluster Analysis for DummiesCluster Analysis for Dummies
Cluster Analysis for Dummies
 
K means Clustering
K means ClusteringK means Clustering
K means Clustering
 

Viewers also liked

Classification Using Decision tree
Classification Using Decision treeClassification Using Decision tree
Classification Using Decision tree
Mohd. Noor Abdul Hamid
 
Svm implementation for Health Data
Svm implementation for Health DataSvm implementation for Health Data
Svm implementation for Health Data
Abhishek Agrawal
 
Classification ANN
Classification ANNClassification ANN
Classification ANN
Vinaytosh Mishra
 
Support Vector Machines
Support Vector MachinesSupport Vector Machines
Support Vector Machines
Prakash Pimpale
 
Customer Centric Data Mining
Customer Centric Data MiningCustomer Centric Data Mining
Customer Centric Data Mininganjeshdubey
 
Knowledge Discovery from Academic Data using Association Rule Mining, Paper P...
Knowledge Discovery from Academic Data using Association Rule Mining, Paper P...Knowledge Discovery from Academic Data using Association Rule Mining, Paper P...
Knowledge Discovery from Academic Data using Association Rule Mining, Paper P...
shibbirtanvin
 
Artificial Neural Network
Artificial Neural Network Artificial Neural Network
Artificial Neural Network
Iman Ardekani
 
artificial neural network
artificial neural networkartificial neural network
artificial neural network
Pallavi Yadav
 
Artificial Intelligence: Artificial Neural Networks
Artificial Intelligence: Artificial Neural NetworksArtificial Intelligence: Artificial Neural Networks
Artificial Intelligence: Artificial Neural Networks
The Integral Worm
 
Decision tree
Decision treeDecision tree
Decision tree
Venkata Reddy Konasani
 
Back propagation
Back propagationBack propagation
Back propagation
Nagarajan
 
Support Vector Machines
Support Vector MachinesSupport Vector Machines
Support Vector Machinesnextlib
 
Decision Trees
Decision TreesDecision Trees
Neural network & its applications
Neural network & its applications Neural network & its applications
Neural network & its applications
Ahmed_hashmi
 
Artificial neural networks
Artificial neural networksArtificial neural networks
Artificial neural networksstellajoseph
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural networkDEEPASHRI HK
 

Viewers also liked (18)

Classification Using Decision tree
Classification Using Decision treeClassification Using Decision tree
Classification Using Decision tree
 
Svm implementation for Health Data
Svm implementation for Health DataSvm implementation for Health Data
Svm implementation for Health Data
 
Classification ANN
Classification ANNClassification ANN
Classification ANN
 
Support Vector Machines
Support Vector MachinesSupport Vector Machines
Support Vector Machines
 
Customer Centric Data Mining
Customer Centric Data MiningCustomer Centric Data Mining
Customer Centric Data Mining
 
Knowledge Discovery from Academic Data using Association Rule Mining, Paper P...
Knowledge Discovery from Academic Data using Association Rule Mining, Paper P...Knowledge Discovery from Academic Data using Association Rule Mining, Paper P...
Knowledge Discovery from Academic Data using Association Rule Mining, Paper P...
 
Lecture12 - SVM
Lecture12 - SVMLecture12 - SVM
Lecture12 - SVM
 
Artificial Neural Network
Artificial Neural Network Artificial Neural Network
Artificial Neural Network
 
artificial neural network
artificial neural networkartificial neural network
artificial neural network
 
Artificial Intelligence: Artificial Neural Networks
Artificial Intelligence: Artificial Neural NetworksArtificial Intelligence: Artificial Neural Networks
Artificial Intelligence: Artificial Neural Networks
 
Decision tree
Decision treeDecision tree
Decision tree
 
Back propagation
Back propagationBack propagation
Back propagation
 
Support Vector Machines
Support Vector MachinesSupport Vector Machines
Support Vector Machines
 
Decision Trees
Decision TreesDecision Trees
Decision Trees
 
Decision tree
Decision treeDecision tree
Decision tree
 
Neural network & its applications
Neural network & its applications Neural network & its applications
Neural network & its applications
 
Artificial neural networks
Artificial neural networksArtificial neural networks
Artificial neural networks
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
 

Similar to Two-step Classification method for Spatial Decision Tree

DM_clustering.ppt
DM_clustering.pptDM_clustering.ppt
DM_clustering.ppt
nandhini manoharan
 
Developing a Tutorial for Grouping Analysis in ArcGIS
Developing a Tutorial for Grouping Analysis in ArcGISDeveloping a Tutorial for Grouping Analysis in ArcGIS
Developing a Tutorial for Grouping Analysis in ArcGIS
COGS Presentations
 
Computer Vision: Visual Extent of an Object
Computer Vision: Visual Extent of an ObjectComputer Vision: Visual Extent of an Object
Computer Vision: Visual Extent of an Object
IOSR Journals
 
Virtual Enterprise Model
Virtual Enterprise ModelVirtual Enterprise Model
Virtual Enterprise Model
Surendra Kancherla
 
Data mining techniques unit v
Data mining techniques unit vData mining techniques unit v
Data mining techniques unit v
malathieswaran29
 
ClusetrigBasic.ppt
ClusetrigBasic.pptClusetrigBasic.ppt
ClusetrigBasic.ppt
ChaitanyaKulkarni451137
 
26-Clustering MTech-2017.ppt
26-Clustering MTech-2017.ppt26-Clustering MTech-2017.ppt
26-Clustering MTech-2017.ppt
vikassingh569137
 
47 292-298
47 292-29847 292-298
47 292-298
idescitation
 
Applying a new subject classification scheme for a database by a data-driven ...
Applying a new subject classification scheme for a database by a data-driven ...Applying a new subject classification scheme for a database by a data-driven ...
Applying a new subject classification scheme for a database by a data-driven ...
National Institute of Informatics
 
UNIT_V_Cluster Analysis.pptx
UNIT_V_Cluster Analysis.pptxUNIT_V_Cluster Analysis.pptx
UNIT_V_Cluster Analysis.pptx
sandeepsandy494692
 
DM UNIT_4 PPT for btech final year students
DM UNIT_4 PPT for btech final year studentsDM UNIT_4 PPT for btech final year students
DM UNIT_4 PPT for btech final year students
sriharipatilin
 
Automated features extraction from satellite images.
Automated features extraction from satellite images.Automated features extraction from satellite images.
Automated features extraction from satellite images.
HimanshuGupta1081
 
Chapter 5.pdf
Chapter 5.pdfChapter 5.pdf
Chapter 5.pdf
DrGnaneswariG
 
Survey of The Problem of Object Detection In Real Images
Survey of The Problem of Object Detection In Real ImagesSurvey of The Problem of Object Detection In Real Images
Survey of The Problem of Object Detection In Real Images
CSCJournals
 
Object Oriented Programming_Lecture 2
Object Oriented Programming_Lecture 2Object Oriented Programming_Lecture 2
Object Oriented Programming_Lecture 2
Mahmoud Alfarra
 
Unit3
Unit3Unit3
Searching Images: Recent research at Southampton
Searching Images: Recent research at SouthamptonSearching Images: Recent research at Southampton
Searching Images: Recent research at Southampton
Jonathon Hare
 
I1803026164
I1803026164I1803026164
I1803026164
IOSR Journals
 
advancedR.pdf
advancedR.pdfadvancedR.pdf
advancedR.pdf
RukhsanaTaj2
 

Similar to Two-step Classification method for Spatial Decision Tree (20)

DM_clustering.ppt
DM_clustering.pptDM_clustering.ppt
DM_clustering.ppt
 
Developing a Tutorial for Grouping Analysis in ArcGIS
Developing a Tutorial for Grouping Analysis in ArcGISDeveloping a Tutorial for Grouping Analysis in ArcGIS
Developing a Tutorial for Grouping Analysis in ArcGIS
 
Computer Vision: Visual Extent of an Object
Computer Vision: Visual Extent of an ObjectComputer Vision: Visual Extent of an Object
Computer Vision: Visual Extent of an Object
 
Virtual Enterprise Model
Virtual Enterprise ModelVirtual Enterprise Model
Virtual Enterprise Model
 
Data mining techniques unit v
Data mining techniques unit vData mining techniques unit v
Data mining techniques unit v
 
ClusetrigBasic.ppt
ClusetrigBasic.pptClusetrigBasic.ppt
ClusetrigBasic.ppt
 
26-Clustering MTech-2017.ppt
26-Clustering MTech-2017.ppt26-Clustering MTech-2017.ppt
26-Clustering MTech-2017.ppt
 
47 292-298
47 292-29847 292-298
47 292-298
 
Applying a new subject classification scheme for a database by a data-driven ...
Applying a new subject classification scheme for a database by a data-driven ...Applying a new subject classification scheme for a database by a data-driven ...
Applying a new subject classification scheme for a database by a data-driven ...
 
UNIT_V_Cluster Analysis.pptx
UNIT_V_Cluster Analysis.pptxUNIT_V_Cluster Analysis.pptx
UNIT_V_Cluster Analysis.pptx
 
DM UNIT_4 PPT for btech final year students
DM UNIT_4 PPT for btech final year studentsDM UNIT_4 PPT for btech final year students
DM UNIT_4 PPT for btech final year students
 
Automated features extraction from satellite images.
Automated features extraction from satellite images.Automated features extraction from satellite images.
Automated features extraction from satellite images.
 
Chapter 5.pdf
Chapter 5.pdfChapter 5.pdf
Chapter 5.pdf
 
lab report 4
lab report 4lab report 4
lab report 4
 
Survey of The Problem of Object Detection In Real Images
Survey of The Problem of Object Detection In Real ImagesSurvey of The Problem of Object Detection In Real Images
Survey of The Problem of Object Detection In Real Images
 
Object Oriented Programming_Lecture 2
Object Oriented Programming_Lecture 2Object Oriented Programming_Lecture 2
Object Oriented Programming_Lecture 2
 
Unit3
Unit3Unit3
Unit3
 
Searching Images: Recent research at Southampton
Searching Images: Recent research at SouthamptonSearching Images: Recent research at Southampton
Searching Images: Recent research at Southampton
 
I1803026164
I1803026164I1803026164
I1803026164
 
advancedR.pdf
advancedR.pdfadvancedR.pdf
advancedR.pdf
 

Recently uploaded

How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
Jisc
 
PART A. Introduction to Costumer Service
PART A. Introduction to Costumer ServicePART A. Introduction to Costumer Service
PART A. Introduction to Costumer Service
PedroFerreira53928
 
How to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERPHow to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERP
Celine George
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
Balvir Singh
 
ESC Beyond Borders _From EU to You_ InfoPack general.pdf
ESC Beyond Borders _From EU to You_ InfoPack general.pdfESC Beyond Borders _From EU to You_ InfoPack general.pdf
ESC Beyond Borders _From EU to You_ InfoPack general.pdf
Fundacja Rozwoju Społeczeństwa Przedsiębiorczego
 
Instructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptxInstructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptx
Jheel Barad
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
siemaillard
 
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXXPhrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
MIRIAMSALINAS13
 
How to Break the cycle of negative Thoughts
How to Break the cycle of negative ThoughtsHow to Break the cycle of negative Thoughts
How to Break the cycle of negative Thoughts
Col Mukteshwar Prasad
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
Celine George
 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
MysoreMuleSoftMeetup
 
Sectors of the Indian Economy - Class 10 Study Notes pdf
Sectors of the Indian Economy - Class 10 Study Notes pdfSectors of the Indian Economy - Class 10 Study Notes pdf
Sectors of the Indian Economy - Class 10 Study Notes pdf
Vivekanand Anglo Vedic Academy
 
Palestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptxPalestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptx
RaedMohamed3
 
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptxStudents, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
EduSkills OECD
 
The geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideasThe geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideas
GeoBlogs
 
Basic phrases for greeting and assisting costumers
Basic phrases for greeting and assisting costumersBasic phrases for greeting and assisting costumers
Basic phrases for greeting and assisting costumers
PedroFerreira53928
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
Thiyagu K
 
Cambridge International AS A Level Biology Coursebook - EBook (MaryFosbery J...
Cambridge International AS  A Level Biology Coursebook - EBook (MaryFosbery J...Cambridge International AS  A Level Biology Coursebook - EBook (MaryFosbery J...
Cambridge International AS A Level Biology Coursebook - EBook (MaryFosbery J...
AzmatAli747758
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
Sandy Millin
 

Recently uploaded (20)

How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
 
PART A. Introduction to Costumer Service
PART A. Introduction to Costumer ServicePART A. Introduction to Costumer Service
PART A. Introduction to Costumer Service
 
How to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERPHow to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERP
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
 
ESC Beyond Borders _From EU to You_ InfoPack general.pdf
ESC Beyond Borders _From EU to You_ InfoPack general.pdfESC Beyond Borders _From EU to You_ InfoPack general.pdf
ESC Beyond Borders _From EU to You_ InfoPack general.pdf
 
Instructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptxInstructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptx
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXXPhrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
 
How to Break the cycle of negative Thoughts
How to Break the cycle of negative ThoughtsHow to Break the cycle of negative Thoughts
How to Break the cycle of negative Thoughts
 

Two-step Classification method for Spatial Decision Tree

  • 1. An Efficient Two-Step Method for Classification of Spatial Data. Authors: Krzysztof Koperski, Jiawei Han, Nebojsa Stefanovic. Presented at: Spatial Data Handling (SDH'98). Reviewed by: Abhishek Agrawal
  • 2. Introduction • Spatial databases hold very large amounts of spatial data, collected for applications ranging from remote sensing to geographical information systems (GIS), computer cartography, and environmental assessment and planning. • These databases contain many interesting implicit spatial relations and patterns that are not explicitly stored and must be extracted. • One spatial data mining technique is the classification of the spatial objects stored in the database, where the objective is to label the objects by identifying a set of rules that describes the partition.
  • 3. Classification Approach: Spatial Decision Tree ❖ In this paper [1], the authors use a decision tree to classify spatial objects based on ➢ non-spatial properties of the classified objects (the traditional approach) ➢ spatial relations of the classified objects to other objects in the database ❖ The authors also analyze the classification of spatial objects in relation to thematic maps and to spatial relationships with other objects in the database. ❖ For this new approach, the authors report experimental results on both real and synthetic data, comparing performance and result quality with existing methods in the same problem space.
  • 4. Problem Definition Business Problem: Label local business units, such as shopping malls or stores, by their profit status, based on the influence of their trade area.
  • 5. Problem Definition (continued) Data Mining Problem: Classify spatial objects, such as shopping malls or stores, described by their attributes into two classes, Y and N, given by the attribute high_profit, which takes the value Y for “yes” and N for “no”.
  • 8. ● In our example, objects OID1 and OID2 belong to class Y and objects OID3, OID4 and OID5 belong to class N. ● We want to build a decision tree classifying objects Oi based on two types of information: ➢ descriptions of the objects in the proximity of objects Oi ➢ non-spatial attributes of the thematic map
  • 9. State of the Art ● Fayyad et al. [2] used decision tree methods to classify images of stellar objects in order to detect stars and galaxies. They used the low-level image processing system FOCAS to select and generate basic attributes. The method deals with image databases and is tailored to astronomical attributes, so it is not suitable for vector data (GIS databases). ● Another approach, by Ester et al. [3], is based on the ID3 algorithm and uses the concept of neighbourhood graphs. It does not analyze aggregate values of non-spatial attributes for neighbouring objects, and it performs no relevance analysis to narrow its search space. ● Ng and Yu [4] described a method for extracting strong, common and discriminating characteristics of clusters based on thematic maps. They did not extend these characteristics to the construction of decision trees.
  • 10. Classification Algorithm Build a decision tree that classifies spatial objects based on spatial predicates, functions and thematic maps. Input: 1. Spatial database containing: a. classified objects Oc b. other spatial objects with non-spatial attributes 2. Geo-mining query specifying: a. the objects to be used, predictive attributes, predicates and functions b. the attribute, predicate or function used as the class label Output: Binary decision tree
  • 11. Method: Spatial Decision Tree 1. Collect a set S of classified objects and other objects that are used for description. 2. For a sample of spatial objects Oc from S: a. Build sets of predicates describing all objects using coarse predicates, functions and attributes. b. Generalize the sets of predicates based on concept hierarchies. c. Find the relevant coarse predicates, functions and attributes using the RELIEF algorithm. 3. Find the best buffer size for aggregates of thematic map polygons: for every relevant non-spatial aggregate attribute, find the buffer size Xmax at which the information gain for the aggregated attribute is maximal. 4. Build sets of predicates using the relevant fine predicates and generalize them based on concept hierarchies. 5. Generate the decision tree.
  • 12. Method: Spatial Decision Tree Step 1: Collect a set S of classified objects and other objects that are used for description.
  • 13. Method: Spatial Decision Tree Step 2.a: For a sample of spatial objects Oc from S, build sets of predicates describing all objects using coarse predicates, functions and attributes. Step 1.a: Define the MBR (minimum bounding rectangle) using the data distribution and a confidence level as the threshold.
  • 14. Method: Spatial Decision Tree Step 2.a: Find a coarse description for the sample, listing its spatial attributes, functions, etc.
  • 15. Method: Spatial Decision Tree Step 2.b: Generalize the sets of predicates using concept hierarchies.
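The generalization step can be illustrated with a minimal sketch. It assumes a predicate is a (name, concept) pair and a concept hierarchy is a child-to-parent mapping. The hierarchy contents and the helper names `generalize_concept` and `generalize_predicates` are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
from typing import Dict, Set, Tuple

Predicate = Tuple[str, str]  # (predicate name, concept), e.g. ("close_to", "mall")

# Hypothetical concept hierarchy: each concept maps to its parent concept.
HIERARCHY: Dict[str, str] = {
    "elementary_school": "school",
    "high_school": "school",
    "school": "public_building",
    "grocery_store": "store",
    "mall": "store",
}


def generalize_concept(concept: str, hierarchy: Dict[str, str], levels: int = 1) -> str:
    """Climb at most `levels` steps up the hierarchy, stopping at a root."""
    for _ in range(levels):
        if concept not in hierarchy:
            break
        concept = hierarchy[concept]
    return concept


def generalize_predicates(
    predicates: Set[Predicate], hierarchy: Dict[str, str], levels: int = 1
) -> Set[Predicate]:
    """Replace each predicate's concept by its ancestor; duplicates merge in the set."""
    return {(name, generalize_concept(c, hierarchy, levels)) for name, c in predicates}


fine = {("close_to", "elementary_school"), ("close_to", "high_school"), ("close_to", "mall")}
coarse = generalize_predicates(fine, HIERARCHY)
```

Here the two school predicates collapse into a single `("close_to", "school")` predicate, which is what keeps the predicate sets small before relevance analysis.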
  • 16. Method: Spatial Decision Tree Step 2.c: Find the relevant coarse predicates, functions and attributes using the RELIEF algorithm. For every object s in the sample, two nearest neighbours are found: one belongs to the same class (Y/N) as s (the nearest hit) and the other belongs to a different class (the nearest miss).
  • 17. Method: Spatial Decision Tree Step 2.c (RELIEF, continued): Weight each predicate based on the neighbours' predicate values: ➔ If the nearest hit has the same predicate value, the weight increases ↑ ➔ If the nearest hit has a different predicate value, the weight decreases ↓ ➔ If the nearest miss has the same predicate value, the weight decreases ↓ ➔ If the nearest miss has a different predicate value, the weight increases ↑ Predicates whose weight exceeds a threshold are selected as relevant.
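The weighting rule from the RELIEF slides can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: predicates are encoded as 0/1 values, neighbours are found by Hamming distance, each of the four rules moves the weight by one unit, and the totals are normalized to the range [-1, 1]. The classic RELIEF formulation differs slightly, since it only subtracts differences to the nearest hit and adds differences to the nearest miss. The function names and the toy data are hypothetical.

```python
from typing import List, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of predicates on which two objects disagree."""
    return sum(x != y for x, y in zip(a, b))


def relief_weights(rows: List[List[int]], labels: List[str]) -> List[float]:
    """Weight each predicate using the nearest hit and nearest miss of every object."""
    n, k = len(rows), len(rows[0])
    weights = [0.0] * k
    used = 0
    for i, (s, cls) in enumerate(zip(rows, labels)):
        hits = [j for j in range(n) if j != i and labels[j] == cls]
        misses = [j for j in range(n) if labels[j] != cls]
        if not hits or not misses:
            continue  # no neighbour of one kind: skip this object
        hit = rows[min(hits, key=lambda j: hamming(s, rows[j]))]
        miss = rows[min(misses, key=lambda j: hamming(s, rows[j]))]
        for p in range(k):
            weights[p] += 1 if hit[p] == s[p] else -1   # hit: same up, different down
            weights[p] += 1 if miss[p] != s[p] else -1  # miss: different up, same down
        used += 1
    return [w / (2 * used) for w in weights] if used else weights


def select_relevant(weights: List[float], threshold: float) -> List[int]:
    """Indices of predicates whose weight exceeds the threshold."""
    return [p for p, w in enumerate(weights) if w > threshold]


# Toy data: column 0 separates the classes, column 1 does not.
rows = [[1, 0], [1, 1], [0, 0], [0, 1], [0, 0]]
labels = ["Y", "Y", "N", "N", "N"]
weights = relief_weights(rows, labels)
relevant = select_relevant(weights, threshold=0.1)
```

On this toy data the first predicate receives the maximum weight and the second a negative one, so only the first passes the threshold.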
  • 19. Method: Spatial Decision Tree Step 3: Find the best size for the buffer for aggregates of thematic map polygons. • Different criteria may be used for the shape of the buffer: buffers may be based on rings or on customer penetration polygons. • Rings have some advantages: 1. ease of use 2. no need to determine the trade area from customer data 3. easy comparison between sites
  • 20. Method: Spatial Decision Tree Step 3: Find the best size for the buffer for aggregates of thematic map polygons. • Buffers represent areas that have an impact on the class label attribute of the classified objects. • The buffer size is fixed by finding, for every relevant non-spatial aggregate attribute, the buffer size Xmax at which the information gain for the aggregated attribute is maximal.
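The buffer-size search can be sketched as a scan over candidate radii. This is a simplified illustration with several assumptions that are not in the slides: buffers are circular rings around each site, every thematic map polygon is reduced to a centroid with one numeric value, the aggregate is a plain sum, and information gain is taken over the best binary threshold on the aggregated value. All function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    result = 0.0
    for c in set(labels):
        p = list(labels).count(c) / n
        result -= p * math.log2(p)
    return result


def best_binary_gain(values: Sequence[float], labels: Sequence[str]) -> float:
    """Highest information gain over all thresholds 'value <= t'."""
    best = 0.0
    for t in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        split = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        best = max(best, entropy(labels) - split)
    return best


def aggregate_in_buffer(site: Tuple[float, float],
                        polygons: List[Tuple[float, float, float]],
                        radius: float) -> float:
    """Sum the values of polygons whose centroid lies within the ring."""
    return sum(v for x, y, v in polygons
               if math.hypot(x - site[0], y - site[1]) <= radius)


def best_buffer_radius(sites, labels, polygons, radii) -> Tuple[float, Dict[float, float]]:
    """Return the radius with maximal information gain, plus all gains."""
    gains = {r: best_binary_gain([aggregate_in_buffer(s, polygons, r) for s in sites], labels)
             for r in radii}
    return max(gains, key=gains.get), gains


sites = [(0, 0), (10, 0), (20, 0), (30, 0)]
labels = ["Y", "Y", "N", "N"]
polygons = [(0, 1.5, 100), (10, 1.5, 100), (20, 1.5, 10), (30, 1.5, 10)]  # (x, y, value)
x_max, gains = best_buffer_radius(sites, labels, polygons, radii=[1, 2, 50])
```

In this toy setup a radius of 1 captures nothing and a radius of 50 captures everything, so both give zero gain; only the middle radius yields aggregates that separate the two classes.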
  • 21. Method: Spatial Decision Tree Step 4 : Build sets of predicates using relevant fine predicates and generalize based on concept hierarchies.
  • 23. Method: Spatial Decision Tree Step 5 : Build Decision Tree
  • 24. Method: Spatial Decision Tree Step 5 : Build Decision Tree : Binary Split ( Based on Info gain )
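The binary split on information gain can be sketched with a small recursive tree builder. This is a minimal illustration, assuming each object is described by a dictionary of boolean predicates. It is an ID3-style sketch rather than the authors' implementation, and the predicate names in the example are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Union

Row = Dict[str, bool]
Tree = Union[str, dict]  # a leaf is a class label, an inner node is a dict


def entropy(labels: List[str]) -> float:
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values()) if n else 0.0


def info_gain(rows: List[Row], labels: List[str], predicate: str) -> float:
    """Entropy reduction from splitting on whether the predicate holds."""
    yes = [l for r, l in zip(rows, labels) if r[predicate]]
    no = [l for r, l in zip(rows, labels) if not r[predicate]]
    n = len(labels)
    return entropy(labels) - (len(yes) / n) * entropy(yes) - (len(no) / n) * entropy(no)


def build_tree(rows: List[Row], labels: List[str], predicates: List[str]) -> Tree:
    """Recursively split on the predicate with the highest information gain."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not predicates:
        return majority
    best = max(predicates, key=lambda p: info_gain(rows, labels, p))
    if info_gain(rows, labels, best) <= 0:
        return majority  # no predicate improves purity
    rest = [p for p in predicates if p != best]
    yes = [(r, l) for r, l in zip(rows, labels) if r[best]]
    no = [(r, l) for r, l in zip(rows, labels) if not r[best]]
    return {
        "predicate": best,
        "yes": build_tree([r for r, _ in yes], [l for _, l in yes], rest),
        "no": build_tree([r for r, _ in no], [l for _, l in no], rest),
    }


def classify(tree: Tree, row: Row) -> str:
    """Follow the yes/no branches until a leaf label is reached."""
    while isinstance(tree, dict):
        tree = tree["yes"] if row[tree["predicate"]] else tree["no"]
    return tree


rows = [
    {"close_to_highway": True, "high_income_area": False},
    {"close_to_highway": True, "high_income_area": True},
    {"close_to_highway": False, "high_income_area": False},
    {"close_to_highway": False, "high_income_area": True},
    {"close_to_highway": False, "high_income_area": False},
]
labels = ["Y", "Y", "N", "N", "N"]
tree = build_tree(rows, labels, ["close_to_highway", "high_income_area"])
```

Because every node tests a single boolean predicate, each split is binary, matching the binary decision tree named as the algorithm's output.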
  • 26. Results & Performance Evaluation • Experiments were performed on synthetic data merged with TIGER U.S. census data for Washington State. • With real data, the best results were obtained with a threshold between 0 and 0.2, and accuracy increased drastically when relevance analysis was used.
  • 27. Conclusion and Future Directions • Classification of geographical objects enables researchers to explore interesting relations between spatial and non-spatial data. • The algorithm uses less costly, approximate spatial computations and relevance analysis to produce smaller and more accurate decision trees. • Pre-computed spatial indexes can be stored as part of regular spatial queries to find neighbourhood attributes. • The authors plan to perform experiments using aggregate values for thematic maps and varying the distance for close_to spatial predicates. • They also plan to integrate the method with their spatial data mining prototype, GeoMiner.
  • 28. References [1] Koperski, Krzysztof, Jiawei Han, and Nebojsa Stefanovic. "An efficient two-step method for classification of spatial data." Proceedings of the International Symposium on Spatial Data Handling (SDH'98). 1998. [2] Fayyad, Usama M., S. George Djorgovski, and Nicholas Weir. "Automating the analysis and cataloging of sky surveys." Advances in Knowledge Discovery and Data Mining. American Association for Artificial Intelligence, 1996. [3] Ester, Martin, Hans-Peter Kriegel, and Jörg Sander. "Spatial data mining: A database approach." Advances in Spatial Databases. Springer Berlin Heidelberg, 1997. [4] Ng, R. T., and Y. Yu. "Discovering strong, common and discriminating characteristics of clusters from thematic maps." Proc. of the 11th Annual Symp. on Geographic Information Systems. 1997.