Spatial Data Mining for Customer Segmentation




  1. Spatial Data Mining for Customer Segmentation
     Data Mining in Practice Seminar, Dortmund, 2003
     Dr. Michael May, Fraunhofer Institut Autonome Intelligente Systeme
     Spatial Data Mining, Michael May, Fraunhofer AIS
  2. Introduction: a classic example for spatial analysis
     Disease cluster: Dr. John Snow's map of deaths in the cholera epidemic, London, September 1854. Infected water pump?
     A good representation is the key to solving a problem.
  3. Good representation because...
     • Represents spatial relation of objects of the same type
     • Represents spatial relation of objects to other objects
     • Shows only relevant aspects and hides irrelevant ones
     It is not only important where a cluster is, but also what else is there (e.g. a water pump)!
  4. Goals of Spatial Data Mining
     • Identifying spatial patterns
     • Identifying spatial objects that are potential generators of patterns
     • Identifying information relevant for explaining the spatial pattern (and hiding irrelevant information)
     • Presenting the information in a way that is intuitive and supports further analysis
  5. Approach to Spatial Knowledge Discovery
     Data Mining + Geographic Information Systems = SPIN!
     (Slide graphic for the data mining part: $\sqrt{n}\,(p - p_0) / \sqrt{p_0 (1 - p_0)}$)
  6. UK, Greater Manchester, Stockport
     • Geographic objects: Streets, Buildings, Rivers, Hospitals, Medical establishment, Shopping areas
     • Census attributes: Persons per household, No. of cars, Long-term illness, Age, Profession, Ethnic group, Unemployment, Education, Migrants
     • ...
  7. Representation of spatial data in Oracle Spatial
     A set of relations R1,...,Rn such that each relation Ri has a geometry attribute Gi or an identifier Ai such that Ri can be linked (joined) to a relation Rk having a geometry attribute Gk
     – Geometry attributes Gi consist of ordered sets of x,y-pairs defining points, lines, or polygons
     – Different types of spatial objects are organized in different relations Ri (geographic layers), e.g. streets, rivers, enumeration districts, buildings
     – Each layer can have its own set of attributes A1,...,An and at most one geometry attribute G
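The representation on slide 7 can be sketched in plain Python. This is only an illustration, not Oracle Spatial's storage model: the `Row` class, the `geometry_for` helper, and all sample values are hypothetical, and `zone_id` is borrowed from the schema on slide 8 as the linking identifier.

```python
from dataclasses import dataclass, field
from typing import Optional

Point = tuple[float, float]


@dataclass
class Row:
    """One tuple of a relation Ri: plain attributes plus at most one geometry."""
    key: str
    attributes: dict = field(default_factory=dict)
    geometry: Optional[list[Point]] = None  # ordered x,y pairs (point, line, or polygon)


# A geographic layer: a relation whose rows carry a geometry attribute G.
# District ids and coordinates are made up for the example.
enumeration_districts = {
    "ed1": Row("ed1", {"name": "District 1"}, [(0, 0), (4, 0), (4, 3), (0, 3)]),
    "ed2": Row("ed2", {"name": "District 2"}, [(4, 0), (8, 0), (8, 3), (4, 3)]),
}

# An attribute-only relation: no geometry, but an identifier that links
# (joins) each row to a relation that does have one.
census = [
    Row("r1", {"zone_id": "ed1", "unemployment": "high"}),
    Row("r2", {"zone_id": "ed2", "unemployment": "low"}),
    Row("r3", {"zone_id": "ed9", "unemployment": "low"}),  # dangling link
]


def geometry_for(row, layer, link="zone_id"):
    """Return the row's own geometry, or that of the layer row it links to."""
    if row.geometry is not None:
        return row.geometry
    linked = layer.get(row.attributes.get(link))
    return linked.geometry if linked is not None else None


print(geometry_for(census[0], enumeration_districts))
```

Keeping geometry in one relation and joining attribute tables to it by identifier means a census table never has to duplicate polygon data.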
  8. Stockport Database Schema
     • Geographical layers (85 tables), e.g. Shopping Region, Water, River, ED, Building, Street, Vegetation; layers are related by "spatially interacts" and "inside"
     • Attribute data: 95 tables (TAB01, ..., TAB61, ..., TAB95) with census data, ~8000 attributes, joined via zone_id
     • Spatial hierarchy: County, District, Wards, Enumeration district
  9. Spatial Predicates in Oracle Spatial
     Topological relation (Egenhofer 1991):
     • A disjoint B, B disjoint A
     • A meets B, B meets A
     • A overlaps B, B overlaps A
     • A equals B, B equals A
     • A covers B, B covered by A
     • A covered-by B, B covers A
     • A contains B, B inside A
     • A inside B, B contains A
     Distance relation: minimum distance between 2 points
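To make the predicates on slide 9 concrete, here is a minimal sketch that classifies the relation between two axis-aligned rectangles. It is plain Python, not a call into Oracle Spatial; the `Rect` type and the `relation` and `min_distance` functions are hypothetical, and restricting geometries to rectangles is a simplifying assumption.

```python
import math
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle with x1 < x2 and y1 < y2."""
    x1: float
    y1: float
    x2: float
    y2: float


def _within(a, b):
    # a lies in b; boundaries may touch
    return b.x1 <= a.x1 and a.x2 <= b.x2 and b.y1 <= a.y1 and a.y2 <= b.y2


def _strictly_within(a, b):
    # a lies in b and the two boundaries do not touch
    return b.x1 < a.x1 and a.x2 < b.x2 and b.y1 < a.y1 and a.y2 < b.y2


def relation(a, b):
    """Name the topological relation 'A <relation> B' for two rectangles."""
    if a == b:
        return "equals"
    share_points = a.x1 <= b.x2 and b.x1 <= a.x2 and a.y1 <= b.y2 and b.y1 <= a.y2
    if not share_points:
        return "disjoint"
    share_interior = a.x1 < b.x2 and b.x1 < a.x2 and a.y1 < b.y2 and b.y1 < a.y2
    if not share_interior:
        return "meets"  # only boundary points in common
    if _within(a, b):
        return "inside" if _strictly_within(a, b) else "covered-by"
    if _within(b, a):
        return "contains" if _strictly_within(b, a) else "covers"
    return "overlaps"


def min_distance(a, b):
    """Minimum distance between any point of a and any point of b."""
    dx = max(b.x1 - a.x2, a.x1 - b.x2, 0.0)
    dy = max(b.y1 - a.y2, a.y1 - b.y2, 0.0)
    return math.hypot(dx, dy)


print(relation(Rect(0, 0, 2, 2), Rect(2, 0, 4, 2)))
```

Each pair of rectangles falls into exactly one of the eight relations, which is what makes them usable as categorical features in a mining step.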
  10. Typical Data Mining representation
     • 'Spreadsheet data': exactly 1 table, atomic values
     • Data mining for spatial data: strong discrepancy between usual and adequate problem representation
  11. SPIN! – The Elements
     (Slide graphic: $\sqrt{n}\,(p - p_0) / \sqrt{p_0 (1 - p_0)}$)
  12. Part 1: Spatial Data Mining Platform
  13. Providing an integrated data mining platform
     • Data access to heterogeneous and distributed data sources (Oracle RDBMS, flat file, spatial data)
     • Organizing and documenting analysis tasks
     • Launching analysis tasks
     • Visualizing results
     Note: Same software basis as MiningMart!
  14. SPIN! Architecture: Enterprise Java Bean-based
     • Client: Java Swing based; client workspace with visual components and algorithm components
     • Communication: JDBC (connections) and RMI/IIOP (references)
     • JBoss application server: Enterprise Java Bean container with data, algorithm, and client workspace components as session and entity beans (persistent objects)
     • Database: object-relational spatial database (Oracle9i)
  15. SPIN! User Interface
     Point & click tool for defining analysis tasks; workspace tree; property editor
  16. Part 2: Visual Exploratory Analysis
  17. Interactive Exploratory Analysis
     • Choropleth maps showing distribution of variable(s) in space
     • Combining spatial and non-spatial displays (parallel coordinate plot, scatter plot)
     • Variables selected and manipulated by the user
     • Displays dynamically linked
     • Powerful for low-dimensional dependencies (3-4)
  18. Part 3: Searching for Explanatory Patterns
  19. Data Mining Tasks in SPIN!
     • Looking for associations between subsets of spatial and non-spatial attributes
       → Spatial Association Rules
     • A phenomenon of interest (e.g. death rate) is given, but it is not clear which of a large number of spatial and non-spatial attributes is relevant for explaining it
       → Spatial Subgroup Discovery
     • A quantitative variable of interest is given and we ask how much this variable changes when one of the relevant independent variables is changed
       → Bayesian Local Regression
  20. Subgroup Discovery Search
     • Subgroup discovery is a multi-relational approach that searches for probabilistically defined deviation patterns (Klösgen 1996, Wrobel 1997)
     • Top-down search from most general to most specific subgroups, exploiting the partial ordering of subgroups (S1 ≥ S2: S1 is more general than S2)
     • Beam search, expanding only the n best ones at each level of search
     • Evaluating hypotheses according to a quality function (T = target group, C = concept):
       $\frac{p(T \mid C) - p(T)}{\sqrt{p(T)\,(1 - p(T))}} \, \sqrt{n} \, \sqrt{\frac{N}{N - n}}$
     Example: T = long-term illness=high, C = unemployment=high
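The search described on slide 20 can be sketched as follows. This is a minimal single-table illustration, not SPIN!'s multi-relational implementation. It assumes that in the quality function n is the size of the subgroup and N the size of the whole population (the slide does not define them), and the district rows, attribute names, and function names are all made up for the example.

```python
import math


def quality(p_t_given_c, p_t, n, N):
    """(p(T|C) - p(T)) / sqrt(p(T)(1 - p(T))) * sqrt(n) * sqrt(N / (N - n))."""
    if n == 0 or n == N or p_t in (0.0, 1.0):
        return 0.0  # empty or all-covering subgroups carry no deviation
    return ((p_t_given_c - p_t) / math.sqrt(p_t * (1 - p_t))
            * math.sqrt(n) * math.sqrt(N / (N - n)))


def evaluate(subgroup, rows, target):
    """Score a subgroup, given as a set of (attribute, value) conditions."""
    N = len(rows)
    p_t = sum(target(r) for r in rows) / N
    covered = [r for r in rows if all(r[a] == v for a, v in subgroup)]
    if not covered:
        return 0.0
    p_tc = sum(target(r) for r in covered) / len(covered)
    return quality(p_tc, p_t, len(covered), N)


def beam_search(rows, conditions, target, beam_width=2, max_depth=2):
    """Top-down search: specialise only the best `beam_width` subgroups per level."""
    beam = [frozenset()]  # most general subgroup: no conditions
    scored = {}
    for _ in range(max_depth):
        candidates = set()
        for subgroup in beam:
            used = {attr for attr, _ in subgroup}
            for cond in conditions:
                if cond[0] not in used:
                    candidates.add(subgroup | {cond})
        candidates = {c for c in candidates if c not in scored}
        if not candidates:
            break
        for c in candidates:
            scored[c] = evaluate(c, rows, target)
        beam = sorted(candidates, key=scored.get, reverse=True)[:beam_width]
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical districts: (unemployment, crossed by M60, long-term illness)
raw = [("high", "yes", "high"), ("high", "yes", "high"), ("high", "no", "high"),
       ("high", "no", "high"), ("high", "yes", "low"), ("low", "yes", "high"),
       ("low", "no", "low"), ("low", "no", "low"), ("low", "no", "low"),
       ("low", "yes", "low")]
rows = [{"unemployment": u, "m60": m, "illness": i} for u, m, i in raw]
conditions = [("unemployment", "high"), ("unemployment", "low"),
              ("m60", "yes"), ("m60", "no")]


def is_ill(row):
    return row["illness"] == "high"


results = beam_search(rows, conditions, is_ill)
for subgroup, score in results[:3]:
    print(sorted(subgroup), round(score, 3))
```

The factor sqrt(n) rewards larger subgroups, so a small subgroup with a perfect hit rate does not automatically outrank a larger one with a slightly weaker deviation.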
  21. Division of labour between Oracle RDBMS and Search Manager
     The search algorithm on the mining server sends a mining query to the database server and receives sufficient statistics in return.
     • Database server: database integration, efficiently organizing mining queries; a mining query delivers statistics (aggregations) sufficient for evaluating many hypotheses
     • Mining server: search in hypothesis space; generation and evaluation of hypotheses (subgroup patterns)
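The division of labour on slide 21 can be illustrated with one aggregation query whose result is enough to score several hypotheses on the client side. In this sketch an in-memory SQLite database stands in for the Oracle server, and the table, columns, data, and function names are all hypothetical. The quality function is the one from slide 20, again assuming n is the subgroup size and N the population size.

```python
import math
import sqlite3

# In-memory SQLite as a stand-in for the database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE districts (unemployment TEXT, m60 TEXT, ill INTEGER)")
conn.executemany(
    "INSERT INTO districts VALUES (?, ?, ?)",
    [("high", "yes", 1), ("high", "yes", 1), ("high", "no", 1), ("high", "no", 1),
     ("high", "yes", 0), ("low", "yes", 1), ("low", "no", 0), ("low", "no", 0),
     ("low", "no", 0), ("low", "yes", 0)],
)

# One mining query: per attribute-value combination, the row count and the
# number of target rows. These are the sufficient statistics.
cells = conn.execute(
    "SELECT unemployment, m60, COUNT(*), SUM(ill) "
    "FROM districts GROUP BY unemployment, m60"
).fetchall()

N = sum(count for _, _, count, _ in cells)
N_T = sum(hits for _, _, _, hits in cells)


def stats_for(unemployment=None, m60=None):
    """Sum the cells matching a hypothesis; None leaves an attribute open."""
    n = n_t = 0
    for u, m, count, hits in cells:
        if unemployment in (None, u) and m60 in (None, m):
            n += count
            n_t += hits
    return n, n_t


def quality(n, n_t):
    """Quality function from slide 20, computed from counts alone."""
    p0 = N_T / N
    if n in (0, N) or p0 in (0.0, 1.0):
        return 0.0
    return (n_t / n - p0) / math.sqrt(p0 * (1 - p0)) * math.sqrt(n) * math.sqrt(N / (N - n))


# Many hypotheses are evaluated on the mining side without further queries.
for hypothesis in ({"unemployment": "high"}, {"m60": "yes"},
                   {"unemployment": "high", "m60": "no"}):
    n, n_t = stats_for(**hypothesis)
    print(hypothesis, n, n_t, round(quality(n, n_t), 3))
```

The database does the scan and aggregation it is good at, while the search process only handles a small table of counts.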
  22. Data Mining visualization
     Example result: high long-term illness in districts crossed by M60
     Displays: p(T|C) vs. p(C); Subgroup Overview; Spatial Venn Diagram; Subgroup Linked Display
  23. Customer Analysis Rodgau, Germany
  24. System Demo: Customer Analysis using MiningMart and SPIN!
  25. Summary & Outlook
     • SPIN! tightly integrates data mining analysis and GIS-based visualization
     • Main features:
       – A spatial data mining platform
       – New spatial data mining algorithms for subgroup discovery, association rules, Bayesian MCMC
       – New visualization methods
     • Integrating spatial data yields results that could not be achieved otherwise
     • MiningMart can be usefully applied to some pre-processing tasks
     • Future tasks: integrating spatial preprocessing in MiningMart
