A Framework for Ontology Usage Analysis




Jamshaid Ashraf
jamshaid.ashraf@gmail.com

Supervisor : Dr Omar Hussain
School of Information Systems, Curtin University, Perth, Western Australia


PhD symposium
ESWC 2012, Heraklion, Crete, Greece (27- 31 May 2012)
Knowledge focused



[1999 – 2006]   - ONTOLOGY
                                                          Instance data




                             Ontologies




                                                Knowledge Focused
                                          •Ontology Languages
                                          •Ontology authoring tools
                                          •Reasoning
                                          •Ontology evaluation
                                          •Ontology evolution
(Structured) Data focused

                                 Ontologies

[2006 – to data] - LINKED DATA



                                                                   Linked Data




                                                        Data Focused
                                              •Linked Data principles
                                              •Linked Open Data project
                                              •LOD cloud
                                              •RDFa
                                              •RDF data analysis
Current state




            Ontol
                  og y




                                                      ata
                                           Li n ked d


       ….. searching less and using more
Increase in the use of ontologies




                                    21 May 2012
Lack of visibility


- Index such as PingTheSemanticWeb does not provide a detailed
  view of ontology usage

- In order to make effective and efficient use of semantic web
  data, we need to know which concepts and relationships and
  how are being used?

- An insight into the structure, understand the pattern available,
  actual use and the intended use
Ontology life cycle




                                      Ontology
     Ontology Dev. Lifecycle
     •Think
     •Design
     •Develop & evaluate
     •Deploy
     •Evangelize
     •Adoption!




     •   Measure and analyze
     •   Learn from it to influence
         future thinking and design
Evaluate, measure and analyse the use of
ontologies on the Web
Benefits of Usage Analysis

(1) Helps in providing usage-based feedback loop to the ontology
    maintenance process for a pragmatic conceptual model update
(2) Assist in building data rich interfaces, exploratory search and
    exploratory data analysis
(3) Provides erudite insight on the state of semantic structured data based
    on prevalent knowledge patterns for the consuming applications
Ontology Usage Analysis Framework (OUSAF)

Identification (selection of ontologies)
 - Domain Ontology
 - Identify candidate ontology(ies) from dataset

Investigation (analysing the use of ontology)
 - Usage/population/instantiation
 - Co-usability/schema-link graph

Representation (represent the usage analysis )
 - Conceptual model to represent ontology usage
 - Ontology Usage Catalogue

Utilization (making use of usage analysis )
 - Use case implementation
 - Publication of ontology usage analysis
Metrics for measuring richness


>Concept Richness (CR): Describes the relationship with other
concepts and the number of attributes to describe the
instances

>Relationship Value (RV): Reflects the possible role of an
object property in creating typed relationship between
different concepts

>Attribute Value (RV): Reflects the number of concepts that
have data properties used to provide values to instances
Metrics for measuring usage



>Concept Usage (CU): Measures the instantiation of the
concept in the knowledge base
              CU(C) = |{t = (s, p, o)| p = rdf:type, o = C}|1
              CUH(C) = |{t = (s, p, o)| p = rdf:type, o entailrdfs9(C)}|

>Relationship Usage (RU): Calculates the number of
triplets in a dataset in which object property is used to
create relationships between different concept’s instances
              RU(P) = | { t:=(s,p,o) | p= P} |

>Attribute Usage (RU): Measures how much data description
is available in the knowledge base for a concept instance
              AU(A) = | { t:=(s,p,o) | p A, o L) |
Structural properties


 Represent ontology usage as a bipartite network

 -Hidden properties in ontology usage network to identify
 cohesive groups and measure semanticity.

 -Study structural properties such as centrality, reciprocity,
 density and reachability


 Capture the knowledge patterns

 -Schema level patterns Hidden properties in ontology usage
 network to identify cohesive groups and measure
 semanticity.

 -Study structural properties such as centrality, reciprocity,
 density and reachability
Initial Results – domain ontology usage

 GR data coverage
Initial Results – use case


Web Schema construction based on Ontology Usage Analysis

Domain : eCommerce
Dataset : 305 data sources (pay-level domains published ecommerce data)


Ranking the terms
U Ontology

   Ontology Usage Ontology (U Ontology)
   Goal : Capture the detail of ontologies and their usage

   Use cases :
          - publish the ontology usage details on the web.
          - generate prototypical SPARQL queries



  Reusing existing ontologies
  -Ontology Metadata Vocabulary (OMV) [1]
  -Ontology Application Framework (OAF) [2]
  -FOAF, DC




 [1] Hartmann, J., Palma, R., Sure, Y., Suárez-Figueroa, M.C., Haase P.: OMV– Ontology Metadata Vocabulary. In: The
 Ontology Patterns for the Semantic Web (OPSW) Workshop at ISWC 2005, Galway, Ireland (2005)

 [2] http://ontolog.cim3.net/file/work/OntologySummit2011/ApplicationFramework/OWL-Ontology/BenefitsAndTechniques-
 WithDocumentation.pdf
Conclusion

                 What and how                                                                           Semantic Web data
                                                                                 Web                 (Linked data cloud) Structured
              ontologies are being
               used on the web?




Ontology Usage Catalogue
    (Michael Uschold)                                                           attribute: http://richard.cyganiak.de/2007/10/lod




                           http://www.cs.vu.nl/~frankh/spool/ISWC2011Keynote/
Future work


  • Build industry specific datasets to understand the ontology
    usage, data and knowledge patterns.
  • Automate the population of U Ontology
  • Publication of Ontology Usage catalogue
  • Recommendations to publishers and vocabulary designers
Thanks!

Questions………

A Framework for Ontology Usage Analysis

  • 1.
    A Framework forOntology Usage Analysis Jamshaid Ashraf jamshaid.ashraf@gmail.com Supervisor : Dr Omar Hussain School of Information Systems, Curtin University, Perth, Western Australia PhD symposium ESWC 2012, Heraklion, Crete, Greece (27- 31 May 2012)
  • 2.
    Knowledge focused [1999 –2006] - ONTOLOGY Instance data Ontologies Knowledge Focused •Ontology Languages •Ontology authoring tools •Reasoning •Ontology evaluation •Ontology evolution
  • 3.
    (Structured) Data focused Ontologies [2006 – to data] - LINKED DATA Linked Data Data Focused •Linked Data principles •Linked Open Data project •LOD cloud •RDFa •RDF data analysis
  • 4.
    Current state Ontol og y ata Li n ked d ….. searching less and using more
  • 5.
    Increase in theuse of ontologies 21 May 2012
  • 6.
    Lack of visibility -Index such as PingTheSemanticWeb does not provide a detailed view of ontology usage - In order to make effective and efficient use of semantic web data, we need to know which concepts and relationships and how are being used? - An insight into the structure, understand the pattern available, actual use and the intended use
  • 7.
    Ontology life cycle Ontology Ontology Dev. Lifecycle •Think •Design •Develop & evaluate •Deploy •Evangelize •Adoption! • Measure and analyze • Learn from it to influence future thinking and design
  • 8.
    Evaluate, measure andanalyse the use of ontologies on the Web
  • 9.
    Benefits of UsageAnalysis (1) Helps in providing usage-based feedback loop to the ontology maintenance process for a pragmatic conceptual model update (2) Assist in building data rich interfaces, exploratory search and exploratory data analysis (3) Provides erudite insight on the state of semantic structured data based on prevalent knowledge patterns for the consuming applications
  • 10.
    Ontology Usage AnalysisFramework (OUSAF) Identification (selection of ontologies) - Domain Ontology - Identify candidate ontology(ies) from dataset Investigation (analysing the use of ontology) - Usage/population/instantiation - Co-usability/schema-link graph Representation (represent the usage analysis ) - Conceptual model to represent ontology usage - Ontology Usage Catalogue Utilization (making use of usage analysis ) - Use case implementation - Publication of ontology usage analysis
  • 11.
    Metrics for measuringrichness >Concept Richness (CR): Describes the relationship with other concepts and the number of attributes to describe the instances >Relationship Value (RV): Reflects the possible role of an object property in creating typed relationship between different concepts >Attribute Value (RV): Reflects the number of concepts that have data properties used to provide values to instances
  • 12.
    Metrics for measuringusage >Concept Usage (CU): Measures the instantiation of the concept in the knowledge base CU(C) = |{t = (s, p, o)| p = rdf:type, o = C}|1 CUH(C) = |{t = (s, p, o)| p = rdf:type, o entailrdfs9(C)}| >Relationship Usage (RU): Calculates the number of triplets in a dataset in which object property is used to create relationships between different concept’s instances RU(P) = | { t:=(s,p,o) | p= P} | >Attribute Usage (RU): Measures how much data description is available in the knowledge base for a concept instance AU(A) = | { t:=(s,p,o) | p A, o L) |
  • 13.
    Structural properties Representontology usage as a bipartite network -Hidden properties in ontology usage network to identify cohesive groups and measure semanticity. -Study structural properties such as centrality, reciprocity, density and reachability Capture the knowledge patterns -Schema level patterns Hidden properties in ontology usage network to identify cohesive groups and measure semanticity. -Study structural properties such as centrality, reciprocity, density and reachability
  • 14.
    Initial Results –domain ontology usage GR data coverage
  • 15.
    Initial Results –use case Web Schema construction based on Ontology Usage Analysis Domain : eCommerce Dataset : 305 data sources (pay-level domains published ecommerce data) Ranking the terms
  • 16.
    U Ontology Ontology Usage Ontology (U Ontology) Goal : Capture the detail of ontologies and their usage Use cases : - publish the ontology usage details on the web. - generate prototypical SPARQL queries Reusing existing ontologies -Ontology Metadata Vocabulary (OMV) [1] -Ontology Application Framework (OAF) [2] -FOAF, DC [1] Hartmann, J., Palma, R., Sure, Y., Suárez-Figueroa, M.C., Haase P.: OMV– Ontology Metadata Vocabulary. In: The Ontology Patterns for the Semantic Web (OPSW) Workshop at ISWC 2005, Galway, Ireland (2005) [2] http://ontolog.cim3.net/file/work/OntologySummit2011/ApplicationFramework/OWL-Ontology/BenefitsAndTechniques- WithDocumentation.pdf
  • 17.
    Conclusion What and how Semantic Web data Web (Linked data cloud) Structured ontologies are being used on the web? Ontology Usage Catalogue (Michael Uschold) attribute: http://richard.cyganiak.de/2007/10/lod http://www.cs.vu.nl/~frankh/spool/ISWC2011Keynote/
  • 18.
    Future work • Build industry specific datasets to understand the ontology usage, data and knowledge patterns. • Automate the population of U Ontology • Publication of Ontology Usage catalogue • Recommendations to publishers and vocabulary designers
  • 19.

Editor's Notes

  • #10 Exploratory data analysis: s an approach to analyzing data sets to summarize their main characteristics in easy-to-understand form, often with visual graphs, without using a statistical model or having formulated a hypothesis
  • #18 What are we trying to achieve in this research? We have seen tremendous growth in the semantic web data (web-of-data) on the web. As a result of it now we have “structured data” on the web in the form of RDF, enabling “ machines ” to automatically understand the data and process it. Now, we have reached to the point where, the availability of semantic data on the web is enabling the possibility of conducting imperial analysis about the data, use of ontologies .