Using formal ontology for integrated spatial data mining


  • Different notion of knowledge
  • How should those factors (domain-specific; task-oriented) be taken into account?
  • Ontology can provide a systematic way of organizing those factors
  • Spatial data mining system driven by formal ontology (as an active component of the information system)
  • Usable clusters
  • Natural clusters

    1. Using formal ontology for integrated spatial data mining
       Julie Sungsoon Hwang, Department of Geography, State University of New York at Buffalo
       ICCSA04, Perugia, Italy, May 14, 2004
    2. Research purposes
       • Clarify the role of formal ontology in KDD
       • Propose a conceptual framework for ontology-based spatial data mining
       • Case study: ontology-based spatial clustering algorithms
    3. Problems in focus (cont.)
       • No single algorithm is best suited to all research purposes and application domains.
           • Without domain knowledge, the same algorithm can yield results that are inconsistent with the facts
           • The same data may have to be analyzed in different ways depending on the users’ goal
    4. Problems in focus
       • How can algorithms be customized to varying domain and task?
           • Developing new algorithms
           • Re-using existing algorithms
       • [Diagram labels: Algorithm A, Algorithm B, Algorithm C, Algorithm D, Algorithm D’; Domain; Task; “Suited to domain and task”]
    5. Relation between data mining and ontology construction
       • [Diagram labels: Data, Information, Knowledge (Data Mining: knowledge discovery); Knowledge, Ontology (Ontology Construction: knowledge acquisition); axis: Level of abstraction]
    6. Role of formal ontology in KDD
       • Provide the context in which the knowledge extracted from data is interpreted and evaluated
       • Guide algorithms so that they suit domain-specific and task-oriented concepts
       • [KDD process diagram]
    7. Using ontology for spatial data mining
       • Ontology formalizes how knowledge is conceptualized, thereby making implicit meaning explicit
       • Data mining extracts high-level knowledge from low-level data, thereby enhancing the level of understanding
       • [Diagram labels: Ontology (Domain Model, Task Model); Spatial Data Mining; Low-level data; High-level knowledge]
    8. Domain-specific spatial data mining
       • Let’s compare two different domains: traffic accidents versus retailers
           • Domain of traffic accident: Is-a := Event; Spatial constraints := In road network
           • Domain of retailers: Is-a := Physical object; Spatial constraints := Outside of road network
       • Spatial data mining algorithms should take into account different conceptualizations (domain-specific properties)
    9. Task-oriented spatial data mining
       • Let’s compare two different tasks: detecting hotspots of traffic accidents versus partitioning market areas based on the location of retailers
           • Detect hotspots of traffic accident: # of clusters k := Depends on spatial distribution; Level of detail := Varies with scale (depends on area of users’ interest)
           • Partition market areas to a retailer: # of clusters k := Given (resource constraint); Level of detail := Doesn’t vary with scale
       • Spatial data mining algorithms should take into account different tasks and users’ needs
    10. Ontology as an active component of information system
        • Top-level Ontology (e.g. space, time, matter, object, event)
        • Domain Ontology (e.g. medicine)
        • Task Ontology (e.g. diagnosing)
        • Application Ontology
        • [Diagram labels: dependence, subject]
        • From Guarino, 1998
    11. Conceptual framework for ontology-based spatial data mining (OBSDM)
    12. Component of OBSDM
    13. OBSDM :: Input :: Metadata
        • The tag structure of XML can be utilized to inform the domain ontology of the semantics of the data
    14. Component of OBSDM
    15. OBSDM :: OBSDMM :: Domain Ont.
        • Terms within the “theme” tag in the metadata are used as tokens to locate the appropriate domain ontology
        • Domain ontology specifies the definition, class, and properties
            • Class example: Accident is a Subclass-Of Temporal-Thing
            • Properties example: Road has a Geographic-Region as a Value-Type
        • Classes inherit their properties from the top-level ontology
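The theme-based lookup described on this slide can be sketched in Python. This is a minimal illustration, not the author's implementation: the metadata layout, the `DOMAIN_ONTOLOGIES` registry, and the function name `locate_domain_ontology` are all assumptions made for the example. Only the “theme” tag and the theory name TRAFFIC-ACCIDENT-DOMAIN come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata record; only the "theme" tag is named in the slides.
METADATA_XML = """
<metadata>
  <title>Reported crashes</title>
  <theme>Traffic Accident</theme>
</metadata>
"""

# Hypothetical registry that maps a theme token to a domain ontology name.
DOMAIN_ONTOLOGIES = {
    "traffic accident": "TRAFFIC-ACCIDENT-DOMAIN",
}


def locate_domain_ontology(xml_text, registry):
    """Return the ontology registered for the metadata theme, or None."""
    root = ET.fromstring(xml_text)
    theme = root.findtext("theme")
    if theme is None:
        return None
    # Normalize the token so that case and padding do not affect the lookup.
    return registry.get(theme.strip().lower())


ontology_name = locate_domain_ontology(METADATA_XML, DOMAIN_ONTOLOGIES)
print(ontology_name)
```

A metadata record without a theme, or with a theme that is not registered, yields `None`, which a caller could treat as "no domain ontology available".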
    16. Domain ontology := Traffic accident
        • Theory TRAFFIC-ACCIDENT-DOMAIN
        • As a spatial thing,
            • Point(x) ∧ On(x, y) ∧ Roadway(y)
            • Line(y) ∧ In(y, z) ∧ Geographic-Region(z)
        • As a temporal thing,
            • Point(x) ∧ At(x, y) ∧ Time(y)
            • Event(x) <=> Occurrence(x) ∧ Notification(x) ∧ Response(x) ∧ Arrival(x)
            • Before(Occurrence(x), Notification(x))
        • As an intangible thing,
            • Accident(x) ∧ RelatedTo(x, y) ∧ Vehicle(y)
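Two of the statements in this theory lend themselves to a simple data check: an accident lies on a roadway, and its occurrence precedes its notification. The sketch below assumes a hypothetical `AccidentEvent` record and a hypothetical set of known roadway names; neither appears in the slides, and the remaining statements of the theory are not encoded.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AccidentEvent:
    """Hypothetical record for one accident (field names are assumptions)."""
    roadway: str            # name of the roadway the accident point lies on
    occurrence: datetime    # when the accident happened
    notification: datetime  # when it was reported


def violated_axioms(event, known_roadways):
    """List the checked domain statements that the record does not satisfy."""
    problems = []
    # Spatial statement: the accident point is on a roadway.
    if event.roadway not in known_roadways:
        problems.append("On(x, y) with Roadway(y)")
    # Temporal statement: Before(Occurrence(x), Notification(x)).
    if not event.occurrence < event.notification:
        problems.append("Before(Occurrence(x), Notification(x))")
    return problems


roads = {"Main Street"}
good = AccidentEvent("Main Street", datetime(2004, 5, 14, 9, 0), datetime(2004, 5, 14, 9, 5))
bad = AccidentEvent("Unknown Lane", datetime(2004, 5, 14, 9, 5), datetime(2004, 5, 14, 9, 0))
print(violated_axioms(good, roads))
print(violated_axioms(bad, roads))
```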
    17. Component of OBSDM
    18. OBSDM :: Input :: User Interface
        • Users can specify a goal, level of detail, and geographic area of interest through the UI
    19. Component of OBSDM
    20. OBSDM :: OBSDMM :: Task Ont.
        • The inputs specified by users in the user interface are translated into the task ontology
        • Task ontology explicitly specifies the goal, methods, requirements, and constraints
    21. Task ontology := Spatial clustering
        • Theory SPATIAL-CLUSTERING-TASK
        • Documentation:
            • This theory defines a task ontology for the spatial clustering task. The spatial clustering task, which is a class of clustering task, is a problem of grouping similar spatial objects into classes.
        • Super classes: Clustering
        • Subclasses:
            • Sub goal:
                • “Find hot spots”
                • “Group similar patterns”
                • “Partition into k-clusters”
            • Requirement:
                • Assignment-Object
                    • Source: Spatial Objects
                    • Target: Clusters
                • Geographic-Scale
                • Detail-Level
            • Constraint:
                • Spatial Objects
                • Operational Constraints
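One way to picture the task ontology is as a typed record that the user-interface inputs are translated into. The following is a sketch under assumptions: the class name `SpatialClusteringTask` and its field names are invented for illustration, while the three sub-goals are taken from the slide.

```python
from dataclasses import dataclass, field

# Sub-goals listed in the SPATIAL-CLUSTERING-TASK theory.
SUB_GOALS = ("Find hot spots", "Group similar patterns", "Partition into k-clusters")


@dataclass
class SpatialClusteringTask:
    """Hypothetical in-memory form of the task ontology (names are assumptions)."""
    goal: str
    geographic_scale: str  # area of interest, e.g. a place name
    detail_level: str      # e.g. "State" or "County"
    operational_constraints: dict = field(default_factory=dict)

    def __post_init__(self):
        # Reject goals that the task ontology does not define.
        if self.goal not in SUB_GOALS:
            raise ValueError(f"unknown sub-goal: {self.goal!r}")


task = SpatialClusteringTask(
    goal="Find hot spots",
    geographic_scale="New York",
    detail_level="State",
)
print(task)
```

Validating the goal at construction time keeps user input that the ontology cannot interpret from reaching the algorithm.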
    22. Component of OBSDM
    23. OBSDM :: OBSDMM :: Alg. Builder; OBSDM :: Output :: GVis tool
        • The algorithm builder puts together the requirements for building the algorithm best suited to the domain of the data and the users’ input (task).
        • Data content is filtered through the domain ontology, and the users’ requirements are filtered through the task ontology.
        • The geographic visualization tool displays the results (patterns discovered)
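The builder step can be read as merging one domain-derived value and several task-derived values into a single parameter set. The sketch below is an assumption about how such a merge might look, not the OBSDM code: `build_parameters` and its dictionary keys are hypothetical. The treatment of k follows the earlier task comparison, where k is given when partitioning and depends on the spatial distribution when detecting hot spots.

```python
def build_parameters(domain_constraint, goal, level_of_detail, place_name, k=None):
    """Combine domain- and task-derived settings into algorithm parameters."""
    params = {
        "constraint": domain_constraint,     # from the domain ontology
        "goal": goal,                        # from the task ontology
        "level_of_detail": level_of_detail,  # from the task ontology
        "place_name": place_name,            # from the task ontology
    }
    if goal == "Partition into k-clusters":
        # Partitioning treats the number of clusters as given by the user.
        if k is None:
            raise ValueError("k must be given for partitioning")
        params["k"] = k
    else:
        # Otherwise the number of clusters is left to the data.
        params["k"] = None
    return params


hotspot_params = build_parameters("Roadway", "Find hot spots", "State", "New York")
print(hotspot_params)
```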
    24. Case study: ontology-based spatial clustering of traffic accidents
        • OBSC Setting
            • Metadata: Theme := Traffic Accident
            • User interface: Goal := “identify hot spots”; LevelOfDetail := State; PlaceName := New York
            • Method: Algorithm := SMTIN; Constraint := Named-Roadway
        • Input: 353 features in Erie
        • Output: 18 clusters in Erie County
    25. Case study: Effect of scale (Task ontology)
        • OBSC clusters reflect the spatial distribution specific to the scale of the users’ interest
        • Control Algorithm: TASK LevelOfDetail := Null, PlaceName := Null; DOMAIN Constraint := Roadway
        • OBSC Algorithm: TASK LevelOfDetail := County, PlaceName := New York; DOMAIN Constraint := Roadway
        • Specifying the area of interest doesn’t mask details
    26. Case study: Effect of constraint (Domain ontology)
        • OBSC clusters identify the physical barrier due to a concept implicit in the domain
        • Control Algorithm: TASK LevelOfDetail := State, PlaceName := New York; DOMAIN Constraint := Null
        • OBSC Algorithm: TASK LevelOfDetail := State, PlaceName := New York; DOMAIN Constraint := Roadway
        • [Figure annotation: Separated by body of water]
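The effect of a domain constraint can be illustrated with a toy example. The sketch below is not the SMTIN algorithm used in the case study; it substitutes a plain distance-threshold (single-linkage) grouping and treats "same roadway" as a stand-in constraint. The function name, the point layout, and the threshold are all assumptions.

```python
import math


def cluster_points(points, max_dist, use_constraint):
    """Group points that are close, optionally only along the same roadway.

    points: list of (x, y, roadway) tuples.
    Returns clusters as sorted lists of point indices.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            xi, yi, road_i = points[i]
            xj, yj, road_j = points[j]
            close = math.hypot(xi - xj, yi - yj) <= max_dist
            allowed = (not use_constraint) or road_i == road_j
            if close and allowed:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Two roadways, "A" and "B", that run close to each other.
accidents = [(0, 0, "A"), (1, 0, "A"), (1, 1, "B"), (2, 1, "B")]
print(cluster_points(accidents, 1.5, use_constraint=False))
print(cluster_points(accidents, 1.5, use_constraint=True))
```

Without the constraint, proximity alone merges all four points into one cluster; with it, points on different roadways stay apart even when they are near each other, which mirrors the contrast between the control and OBSC settings.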
    27. Case study: Benefit of using ontology in spatial clustering
        • Incorporating ontology in spatial clustering algorithms enhances the quality of spatial clustering results
            • Task ontology makes clusters usable
                • Responsive to users’ view
            • Domain ontology makes clusters natural
                • Dictated by concept implicit in domain
    28. Conclusion (cont.)
        • Presents how ontologies are incorporated in spatial data mining algorithms
            • Semantic linkage between ontologies and algorithms through parameterization
                • Scale as a task-oriented property
                • Constraint as a domain-specific property
    29. Conclusion
        • Ontology is examined as a means to customize algorithms to varying domains and tasks
            • Ontology enables algorithms to reflect concepts implicit in the domain and to adapt to the users’ view
            • Ontology provides a semantically plausible way to re-use existing algorithms
        • Ontology provides a systematic way of organizing the various factors that dictate the mechanisms underlying the data mining process