A Geographic Knowledge Base for Semantic Web Applications
Marcirio Silveira Chaves, Mário J. Silva, Bruno Martins
20th Brazilian Symposium on Databases (SBBD 2005), Uberlândia, MG
Linguateca, www.linguateca.pt
Motivation/Context
  • GKB (Geographic Knowledge Base): geographic and network information, exported as ontologies
  • Geographic-aware Semantic Web applications
  • GREASE: Geographic Reasoning for Search Engines
Presentation Structure
  • Conceptual Design of GKB
  • Knowledge Integration
  • Using Geographic Knowledge in GKB
  • GKB as an Ontology
  • Statistics of the Ontologies Created
  • Applications using GKB
  • Final Remarks
Information Sources used by GKB
  • Geo-Administrative and Geo-Physical Domain: administrative sources, postal sources, gazetteers, Wikipedia
  • Network Domain: FCCN (Web domains, Web sites)
Architecture of GKB
Feature concept in GKB
A feature is a meaningful object in the selected domain of discourse [ISO19109], e.g. countries, cities and localities.
Conceptual Design of GKB GKB meta-model
Knowledge Integration in GKB
GKB builds its hierarchy from different information sources. Algorithm:
  1. Search for the lowest feature type common to both hierarchies.
  2. If one exists, identify the instances of that type common to both hierarchies.
  3. Once the common instances are identified, go up the hierarchy and search for the lowest common ancestor.
  4. Measure the distance (in number of partOf relationships) between the common instances and their ancestors.
  5. The ancestor with the smallest distance to the common instances is merged, through a partOf relationship, with the ancestor in the other hierarchy.
  6. The existing relationships in both hierarchies are maintained.
Knowledge Integration in GKB
GKB hierarchy from different information sources (figure):
  H1: Norte (NUT2)
        Grande Porto (NUT3): Matosinhos, Vila Nova de Gaia (MUNICIPALITY)
        Tâmega (NUT3): Penafiel (MUNICIPALITY)
  H2: Porto (DISTRITO): Matosinhos, Vila Nova de Gaia, Penafiel
Knowledge Integration in GKB
Merged Hierarchy (figure; nodes: Norte, Grande Porto, Porto, Tâmega, Penafiel, Matosinhos, Vila Nova de Gaia)
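The merge steps can be sketched in Python on the H1/H2 example. This is a minimal sketch of one possible reading of the slides, not the authors' implementation: the `Hierarchy` class, the function names, the label "Porto (distrito)", the use of the maximum distance over the common instances, and the tie-breaking rule are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

Edge = Tuple[str, str]  # (child, parent), read as "child partOf parent"


class Hierarchy:
    """Hypothetical container: partOf links plus a feature type per node."""

    def __init__(self, part_of: Dict[str, str], ftype: Dict[str, str]):
        self.part_of = part_of  # child -> parent (assumed to form a tree)
        self.ftype = ftype      # feature -> feature type

    def ancestors(self, node: str) -> List[str]:
        """Ancestors of node, nearest first."""
        chain = []
        while node in self.part_of:
            node = self.part_of[node]
            chain.append(node)
        return chain


def lowest_common_ancestor(h: Hierarchy, nodes: List[str]) -> Optional[Tuple[str, int]]:
    """Return (ancestor, distance in partOf steps), or None if there is none."""
    chains = [h.ancestors(n) for n in nodes]
    for candidate in chains[0]:
        if all(candidate in chain for chain in chains):
            # Assumption: use the largest distance among the common instances.
            distance = max(chain.index(candidate) + 1 for chain in chains)
            return candidate, distance
    return None


def merge(h1: Hierarchy, h2: Hierarchy) -> Optional[Set[Edge]]:
    # Steps 1-2: lowest common feature type, then its common instances.
    common_types = set(h1.ftype.values()) & set(h2.ftype.values())
    if not common_types:
        return None

    def depth(t: str) -> int:
        return max(len(h1.ancestors(n)) for n, ft in h1.ftype.items() if ft == t)

    lowest = max(common_types, key=depth)
    common = sorted(n for n, ft in h1.ftype.items()
                    if ft == lowest and h2.ftype.get(n) == lowest)
    if not common:
        return None
    # Steps 3-4: lowest common ancestor and its distance in each hierarchy.
    lca1 = lowest_common_ancestor(h1, common)
    lca2 = lowest_common_ancestor(h2, common)
    if lca1 is None or lca2 is None:
        return None
    (a1, d1), (a2, d2) = lca1, lca2
    # Step 5: the closer ancestor becomes partOf the other one
    # (ties go to the second hierarchy; this choice is arbitrary).
    child, parent = (a1, a2) if d1 < d2 else (a2, a1)
    # Step 6: all existing relationships are kept.
    return set(h1.part_of.items()) | set(h2.part_of.items()) | {(child, parent)}


h1 = Hierarchy(
    {"Grande Porto": "Norte", "Tâmega": "Norte", "Matosinhos": "Grande Porto",
     "Vila Nova de Gaia": "Grande Porto", "Penafiel": "Tâmega"},
    {"Norte": "NUT2", "Grande Porto": "NUT3", "Tâmega": "NUT3",
     "Matosinhos": "MUNICIPALITY", "Vila Nova de Gaia": "MUNICIPALITY",
     "Penafiel": "MUNICIPALITY"},
)
h2 = Hierarchy(
    {"Matosinhos": "Porto (distrito)", "Vila Nova de Gaia": "Porto (distrito)",
     "Penafiel": "Porto (distrito)"},
    {"Porto (distrito)": "DISTRITO", "Matosinhos": "MUNICIPALITY",
     "Vila Nova de Gaia": "MUNICIPALITY", "Penafiel": "MUNICIPALITY"},
)
merged = merge(h1, h2)
```

Under these assumptions the common ancestor in H2 (the distrito) is one partOf step from the municipalities, while Norte in H1 is two steps away, so the sketch adds the link "Porto (distrito) partOf Norte". Whether the authors' merged figure attaches the distrito at exactly this node cannot be confirmed from the slide text.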
Using Geographic Knowledge in GKB
  • Geographic scopes: e.g. the site www.cm-lisboa.pt is assigned the scope Lisboa (municipality)
  • Rules: derive new relationships and knowledge, expressed in Description Logics (DLs)
  • Geo domain: names composed of multiple words are represented in different ways
  • Network domain: URL names are decomposed according to the corresponding domain division
Using Geographic Knowledge in GKB
ABox in DLs for the municipality of Santiago do Cacém:
  geoFeatureName(270, “santiago do cacem”).
  geoFeatureName(270, “santiagocacem”).
  geoFeatureName(270, “santiago-do-cacem”).
  geoFeatureName(270, “santiago-cacem”).
  geoFeatureType(270, “CON”).
ABox in DLs for its web site, www.cm-santiago-do-cacem.pt:
  netSiteSubDomain(33684, “www”).
  netSitePrefix(33684, “cm”).
  netSiteDomainToken(33684, “santiago-do-cacem”).
  netSiteTLD(33684, “pt”).
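The two kinds of ABox facts above can be produced mechanically. The following Python sketch is an illustration only, under stated assumptions: the function names, the `CONNECTORS` list, and the reading of the four name variants (space-separated, hyphenated, and both forms without the connector word) are hypothetical and not taken from GKB itself.

```python
import unicodedata
from typing import Dict, Set

# Assumed list of Portuguese connector words dropped in the short variants.
CONNECTORS = {"do", "da", "de", "dos", "das"}


def strip_accents(text: str) -> str:
    """Remove diacritics, e.g. 'Cacém' -> 'Cacem'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")


def name_variants(name: str) -> Set[str]:
    """Generate the alternative spellings of a multi-word feature name."""
    words = strip_accents(name).lower().split()
    content = [w for w in words if w not in CONNECTORS]
    return {
        " ".join(words),    # santiago do cacem
        "-".join(words),    # santiago-do-cacem
        "".join(content),   # santiagocacem
        "-".join(content),  # santiago-cacem
    }


def decompose_site(host: str) -> Dict[str, str]:
    """Split a host name into subdomain, prefix, domain token and TLD.

    Assumes at least two dot-separated labels and that the prefix,
    if any, is the part of the domain label before its first hyphen.
    """
    labels = host.lower().split(".")
    subdomain, domain, tld = ".".join(labels[:-2]), labels[-2], labels[-1]
    prefix, sep, token = domain.partition("-")
    if not sep:  # no hyphen: the whole label is the token
        prefix, token = "", domain
    return {"subdomain": subdomain, "prefix": prefix, "token": token, "tld": tld}


variants = name_variants("Santiago do Cacém")
site = decompose_site("www.cm-santiago-do-cacem.pt")
```

Here `site["token"]` is one of the generated `variants`, which is what allows a rule to link the site to the municipality.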
Using Geographic Knowledge in GKB
Terminology Description (TBox in DLs), rule for municipalities:
  hasScope(idN, idG) ←
      netSiteDomainToken(idN, X)
    ∧ (netSitePrefix(idN, “cm”) ∨ netSitePrefix(idN, “mun”))
    ∧ geoFeatureType(idG, “CON”)
    ∧ geoFeatureName(idG, X).
Using Geographic Knowledge in GKB
Example:
  hasScope(idN, idG) ←
      netSiteDomainToken(idN, X)
    ∧ (netSitePrefix(idN, “cm”) ∨ netSitePrefix(idN, “mun”))
    ∧ geoFeatureType(idG, “CON”)
    ∧ geoFeatureName(idG, X).
Facts:
  netSiteDomainToken(33684, “santiago-do-cacem”).
  netSitePrefix(33684, “cm”).
  geoFeatureType(270, “CON”).
  geoFeatureName(270, “santiago-do-cacem”).
New knowledge:
  hasScope(33684, 270).
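The derivation in this example can be reproduced with a few lines of Python. This is only a sketch of how such a rule might be evaluated over sets of facts; GKB expresses it in Description Logics, and the function name `derive_has_scope` and the set-of-tuples encoding are assumptions made here.

```python
from typing import Set, Tuple

Fact = Tuple[int, str]

# Facts copied from the example.
net_site_domain_token: Set[Fact] = {(33684, "santiago-do-cacem")}
net_site_prefix: Set[Fact] = {(33684, "cm")}
geo_feature_type: Set[Fact] = {(270, "CON")}
geo_feature_name: Set[Fact] = {(270, "santiago-do-cacem")}


def derive_has_scope(tokens: Set[Fact], prefixes: Set[Fact],
                     types: Set[Fact], names: Set[Fact]) -> Set[Tuple[int, int]]:
    """Return every (idN, idG) pair that satisfies the municipality rule."""
    scopes = set()
    for id_n, x in tokens:
        # netSitePrefix(idN, "cm") OR netSitePrefix(idN, "mun")
        if not ({(id_n, "cm"), (id_n, "mun")} & prefixes):
            continue
        for id_g, name in names:
            # geoFeatureName(idG, X) AND geoFeatureType(idG, "CON")
            if name == x and (id_g, "CON") in types:
                scopes.add((id_n, id_g))
    return scopes


has_scope = derive_has_scope(net_site_domain_token, net_site_prefix,
                             geo_feature_type, geo_feature_name)
```

With the facts above, `has_scope` contains the single pair (33684, 270), matching the new knowledge stated in the example.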
Using Geographic Knowledge in GKB
Scopes assigned by GKB rules to Portuguese web sites:

  Site Type          # of sites   # of matches
  distritos                  33       17 (52%)
  municipalities            288      261 (90%)
  freguesias                300      124 (41%)
  basic schools            1955      124 (6%)
  training centers          152       55 (36%)
  high schools              402      105 (26%)

The scopes are extended to the web pages under each of the sites with matching subdomains.
GKB as an Ontology
Excerpt from Geo-Net-PT01:

  <gn:Geo_Feature rdf:ID="GEO_238">
    <gn:geo_id>238</gn:geo_id>
    <gn:geo_name xml:lang="pt">Porto</gn:geo_name>
    <gn:geo_type_id rdf:resource="#CON"/>
    <gn:info_source_id rdf:resource="#INE"/>
    <gn:related_to>
      <rdf:Bag>
        <rdf:li>
          <gn:Geo_Relationship>
            <gn:rel_type_id rdf:resource="#PRT"/>
            <gn:geo_id>
              <rdf:Bag>
                <rdf:li rdf:resource="#GEO_130"/>
                <rdf:li rdf:resource="#GEO_3967"/>
              </rdf:Bag>
            </gn:geo_id>
          </gn:Geo_Relationship>
        </rdf:li>
        <rdf:li>
          <gn:Geo_Relationship>
            <gn:rel_type_id rdf:resource="#ADJ"/>
            <gn:geo_id>
              <rdf:Bag>
                <rdf:li rdf:resource="#GEO_127"/>
                <rdf:li rdf:resource="#GEO_156"/>
                <rdf:li rdf:resource="#GEO_162"/>
                <rdf:li rdf:resource="#GEO_331"/>
              </rdf:Bag>
            </gn:geo_id>
          </gn:Geo_Relationship>
        </rdf:li>
      </rdf:Bag>
    </gn:related_to>
    <gn:population>263131</gn:population>
  </gn:Geo_Feature>
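A consumer of the ontology might read such a feature with Python's standard `xml.etree.ElementTree`. The sketch below is illustrative only: the slide does not show the namespace declarations, so the `gn` namespace URI is a placeholder, the wrapping `rdf:RDF` element is an assumption, and the excerpt is shortened.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
GN = "http://example.org/geo-net#"  # placeholder: the real URI is not on the slide

DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:gn="{GN}">
  <gn:Geo_Feature rdf:ID="GEO_238">
    <gn:geo_id>238</gn:geo_id>
    <gn:geo_name xml:lang="pt">Porto</gn:geo_name>
    <gn:geo_type_id rdf:resource="#CON"/>
    <gn:related_to><rdf:Bag>
      <rdf:li><gn:Geo_Relationship>
        <gn:rel_type_id rdf:resource="#PRT"/>
        <gn:geo_id><rdf:Bag>
          <rdf:li rdf:resource="#GEO_130"/>
          <rdf:li rdf:resource="#GEO_3967"/>
        </rdf:Bag></gn:geo_id>
      </gn:Geo_Relationship></rdf:li>
    </rdf:Bag></gn:related_to>
    <gn:population>263131</gn:population>
  </gn:Geo_Feature>
</rdf:RDF>"""

NS = {"rdf": RDF, "gn": GN}
RESOURCE = f"{{{RDF}}}resource"  # qualified name of the rdf:resource attribute


def read_feature(feature: ET.Element) -> dict:
    """Collect name, type, population and related features by relationship type."""
    relations = {}
    for rel in feature.iterfind(".//gn:Geo_Relationship", NS):
        rel_type = rel.find("gn:rel_type_id", NS).get(RESOURCE)
        targets = [li.get(RESOURCE)
                   for li in rel.iterfind("gn:geo_id/rdf:Bag/rdf:li", NS)]
        relations[rel_type] = targets
    return {
        "id": int(feature.findtext("gn:geo_id", namespaces=NS)),
        "name": feature.findtext("gn:geo_name", namespaces=NS).strip(),
        "type": feature.find("gn:geo_type_id", NS).get(RESOURCE),
        "population": int(feature.findtext("gn:population", namespaces=NS)),
        "relations": relations,
    }


root = ET.fromstring(DOC)
porto = read_feature(root.find("gn:Geo_Feature", NS))
```

The resulting dictionary groups the related feature identifiers by relationship type, so the entries under `#PRT` can be followed to walk the part-of hierarchy.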
Statistics of the Ontologies Created

  Statistic                                              Portugal            World
  # of features                                          418,065             12,293
  # of relationships                                     419,867             12,258
  # of part-of relationships                             418,340 (99.83%)    12,245 (99.89%)
  # of equivalence relationships                         395 (0.09%)         2,501 (20.40%)
  # of adjacency relationships                           1,132 (0.27%)       13 (0.10%)
  Avg. broader features per feature                      1.0016              1.07
  Avg. narrower features per feature                     10.56               475.44
  Avg. equivalent features per feature with equivalent   1.99                3.82
  Avg. adjacent features per feature with adjacent       3.54                6.5
  # of features without ancestors                        3 (0.00%)           1 (0.00%)
  # of features without descendants                      374,349 (89.54%)    12,045 (97.98%)
  # of features without equivalent                       417,867 (99.95%)    11,819 (96.14%)
  # of features without adjacent                         417,739 (99.92%)    12,291 (99.99%)
Applications using GKB
  • NERC tool for recognizing geographical references in text
  • Classification tool for assigning documents to a corresponding geographical scope
  • Information retrieval interface for geographical queries
Applications using GKB
Final Remarks
  • A domain-independent model for storing geographic and network knowledge
  • Sharing of the collected knowledge as formal ontologies
  • Geo-Net-PT01: the first public geographic ontology of Portugal, http://xldb.fc.ul.pt/geonetpt
  • Future work: augmenting the knowledge in GKB with geographic entities extracted from the texts of the Portuguese Web
