A Pragmatic Approach to Semantic Repositories Benchmarking

1. A Pragmatic Approach to Semantic Repositories Benchmarking
   Dhaval Thakker, Taha Osman, Shakti Gohil, Phil Lakin
   © Dhaval Thakker, Press Association, Nottingham Trent University
2. Outline
   - Introduction to the Semantic Technology Project at PA Images
   - Semantic Repository benchmarking
     - Parameters
     - Datasets
   - Results and Analysis
     - Loading and querying results
     - Modification tests
   - Conclusions
3. http://www.pressassociation.com
   - Press Association & its operations
     - UK's leading multimedia news & information provider
     - Core news agency operation
     - Content and editorial services: sports data, entertainment guides, weather forecasting, photo syndication
4. Press Association Images
5. Current Search Engine
6. Browsing Engine
   - Images of sports, entertainment and news domain entities: people, events, locations, etc.
   - The current engine lacks metadata-rich browsing functionality that can exploit these entities for a richer browsing experience.
   - Goal: help searchers browse images by these entities and their relationships
   - Approach: a Semantic Web based browsing engine
7. Semantic Repository Benchmarking
   - "A tool, which combines the functionality of an RDF-based DBMS and an inference engine and can store data and evaluate queries, regarding the semantics of ontologies and metadata schemata." *
   - Criteria for selection:
     - Analytical parameters, such as the expected level of reasoning and query language support
   - Selected semantic repositories: AllegroGraph, BigOWLIM, Oracle, Sesame, Jena TDB, Virtuoso

   * Kiryakov, A., "Measurable Targets for Scalable Reasoning", Ontotext Technology White Paper, Nov 2007.
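To make the quoted definition concrete: each selected store pairs RDF storage with an inference layer. A minimal sketch in the spirit of that definition, assuming the Sesame 2.x API (one of the selected repositories); the forward-chaining RDFS inferencer is just one possible inference configuration, not the benchmark setup itself:

```java
import org.openrdf.repository.Repository;
import org.openrdf.repository.RepositoryConnection;
import org.openrdf.repository.sail.SailRepository;
import org.openrdf.sail.inferencer.fc.ForwardChainingRDFSInferencer;
import org.openrdf.sail.memory.MemoryStore;

public class RepositoryDefinitionDemo {
    public static void main(String[] args) throws Exception {
        // "RDF-based DBMS + inference engine": an in-memory store
        // wrapped in a forward-chaining RDFS inferencer.
        Repository repo = new SailRepository(
                new ForwardChainingRDFSInferencer(new MemoryStore()));
        repo.initialize();

        RepositoryConnection con = repo.getConnection();
        try {
            // Store data and evaluate queries against both asserted
            // and inferred statements, e.g.:
            // con.add(...); con.prepareTupleQuery(...).evaluate();
        } finally {
            con.close();
            repo.shutDown();
        }
    }
}
```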
8. PA Dataset
   - Ontology
     - OWL-Lite to OWL-DL
     - Classification, subproperties, inverse properties and hasValue for automatic classification (see the sketch after this slide)
     - 147 classes, 60 object properties and 30 data properties
   - Knowledge base
     - Entities
     - 6.6M triples
     - Approx. 1.2M entities
     - Disk space: 1.23 GB
   - Image annotations
     - 2-4 triples per image
     - 8M triples
     - Approx. 5M images
     - Disk space: 1.57 GB
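The hasValue construct listed above is what drives automatic classification: any individual carrying the designated property value is inferred to belong to the restricted class. A minimal sketch with the Jena ontology API; the namespace, class and property names are hypothetical, not the actual PA ontology:

```java
import org.apache.jena.ontology.*;
import org.apache.jena.rdf.model.ModelFactory;

public class HasValueClassification {
    public static void main(String[] args) {
        // In-memory OWL model with Jena's OWL Micro rule reasoner,
        // which covers owl:hasValue inference.
        OntModel m = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
        String ns = "http://example.org/pa#"; // hypothetical namespace

        OntClass sport = m.createClass(ns + "Sport");
        ObjectProperty playsSport = m.createObjectProperty(ns + "playsSport");
        Individual football = m.createIndividual(ns + "Football", sport);

        // Footballer = the class of everything with playsSport = Football;
        // membership is inferred, never asserted.
        HasValueRestriction footballer =
                m.createHasValueRestriction(ns + "Footballer", playsSport, football);

        OntClass person = m.createClass(ns + "Person");
        Individual rooney = m.createIndividual(ns + "WayneRooney", person);
        rooney.addProperty(playsSport, football);

        // The reasoner classifies the individual automatically.
        System.out.println(rooney.hasOntClass(footballer)); // true
    }
}
```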
9. Published Benchmarks & Datasets
   - The Lehigh University Benchmark (LUBM)
     - First standard platform for benchmarking OWL systems
     - Gradually fell behind the increasing expressivity of OWL reasoning
   - The University Ontology Benchmark (UOBM)
     - Improves the reasoning coverage of LUBM
     - OWL-DL and OWL-Lite inferencing
   - The Berlin SPARQL Benchmark (BSBM)
     - Provides a comprehensive evaluation of SPARQL query features
10. Benchmarking Parameters (A = analytical, P = practical)
   - Classification of semantic stores as native, memory-based or database-based (A)
   - Forward-chaining or backward-chaining (A)
   - Load time (P)
   - Query response time (P)
   - Query results analysis (P)
   - RDF store update tests (P)
   - Different serialisations and their impact on performance
   - Scalability (A)
   - Reasoner integration (A)
   - Query language supported (A)
   - Clustering support (A)
   - Programming language support (A)
   - Platform support (A)
   - RDF view support (support for non-RDF data) (A)
11. UOBM: Load Time
12. Load Time: PA Dataset
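For context, the load-time measurements on these two slides reduce to timing a bulk import. A minimal sketch against Jena TDB (one of the benchmarked stores), assuming a recent Apache Jena release; the directory and file names are placeholders:

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.query.ReadWrite;
import org.apache.jena.riot.RDFDataMgr;
import org.apache.jena.tdb.TDBFactory;

public class LoadTimer {
    public static void main(String[] args) {
        // Persistent TDB store on disk; the directory is a placeholder.
        Dataset ds = TDBFactory.createDataset("/data/tdb-pa");

        long start = System.currentTimeMillis();
        ds.begin(ReadWrite.WRITE);
        try {
            // Bulk-load ontology, knowledge base and annotations
            // (file names are placeholders).
            RDFDataMgr.read(ds.getDefaultModel(), "pa-ontology.owl");
            RDFDataMgr.read(ds.getDefaultModel(), "pa-knowledgebase.nt");
            RDFDataMgr.read(ds.getDefaultModel(), "pa-annotations.nt");
            ds.commit();
        } finally {
            ds.end();
        }
        System.out.printf("Loaded in %.1f s%n",
                (System.currentTimeMillis() - start) / 1000.0);
    }
}
```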
13. Dataset Queries
   - Measuring query execution speed
   - SPARQL queries, issued from a Java-based client
   - Execution speed measured over three runs (see the sketch after this slide)
   - PA Dataset:
     - 13 queries to test the expressiveness supported
     - Subsumption, inverse properties (Q6, Q12, Q15), automatic classification
   - UOBM Dataset:
     - 15 queries: 12 of OWL-Lite and 3 of OWL-DL expressivity
     - Q5 and Q7 involve transitive properties (owl:TransitiveProperty)
     - Q6 relies on the repository supporting inverse properties (owl:inverseOf)
     - Q10 requires symmetric properties (owl:SymmetricProperty)
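A minimal sketch of the measurement loop described above, assuming a Jena model already loaded with the dataset; the timing method is the point here, not the actual benchmark queries:

```java
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.ResultSetFormatter;
import org.apache.jena.rdf.model.Model;

public class QueryTimer {

    // Mean execution time over three runs, as in the benchmark setup.
    static double meanSeconds(Model model, String sparql) {
        long total = 0;
        for (int run = 0; run < 3; run++) {
            long start = System.nanoTime();
            try (QueryExecution qe = QueryExecutionFactory.create(sparql, model)) {
                // Consume the whole result set so lazy evaluation
                // does not distort the timing.
                ResultSetFormatter.consume(qe.execSelect());
            }
            total += System.nanoTime() - start;
        }
        return (total / 3.0) / 1e9;
    }
}
```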
14. UOBM: Query Execution
    Execution timings in seconds; (P) = partially answered, N = query was not answered by this tool.

    | Query | BigOWLIM  | Jena TDB     | Sesame    | Oracle    | Allegrograph | Virtuoso  |
    |-------|-----------|--------------|-----------|-----------|--------------|-----------|
    | Q1    | 0.047     | 0.031        | 0.203     | 0.141     | 21.921       | 6.766 (P) |
    | Q2    | 0.062     | 0.001 (P)    | 0.001 (P) | N         | 8.906 (P)    | N         |
    | Q3    | 0.062     | 0.016        | 0.109     | N         | 651.237      | N         |
    | Q4    | 0.063     | 120          | 0.14      | N         | N (infinite) | N         |
    | Q5    | 0.047     | N            | N         | N         | 1.281        | N         |
    | Q6    | 0.047     | N            | N         | N         | 1153.025     | N         |
    | Q7    | 0.001     | N            | N         | N         | 300.12       | N         |
    | Q8    | 0.031     | N            | N         | N         | 6.843 (P)    | N         |
    | Q9    | 0.031 (P) | N            | N         | N         | N            | N         |
    | Q10   | 0.016     | 0.001        | 0.001     | 0.001 (P) | 0.25         | 0         |
    | Q11   | 0.062     | N (infinite) | 0.094 (P) | 0.001 (P) | N (infinite) | N         |
    | Q12   | 0.016     | N            | N         | N         | 476.507      | N         |
    | Q13   | N         | N            | N         | N         | N            | N         |
    | Q14   | 0.016     | N            | N         | N         | N (infinite) | N         |
    | Q15   | N         | N            | N         | N         | N            | N         |
15. PA Dataset: Query Execution
    Execution timings in seconds; (P) = partially answered, N = query was not answered by this tool.

    | Query | BigOWLIM | Jena TDB | Sesame    | Allegrograph | Virtuoso  |
    |-------|----------|----------|-----------|--------------|-----------|
    | Q1    | 0.219    | 0.047    | 0.469 (P) | 26.422       | 2.234 (P) |
    | Q2    | 0.063    | N        | N         | N            | N         |
    | Q4    | 0.047    | N        | N         | N            | N         |
    | Q5    | 0.078    | N        | 0.141     | 1.719        | 0.172     |
    | Q6    | 0.45     | 0.001    | N         | 3.765        | N         |
    | Q7    | 0.093    | N        | 0.203     | 28.688       | 84.469    |
    | Q8    | 0.062    | 0.001    | 0.11      | 3.39         | 0.047     |
    | Q9    | 0.016    | N        | 0.171     | 1.782        | 0.156     |
    | Q10   | 0        | N        | 0.047     | 1.734        | 0.001     |
    | Q11   | 0.062    | 0.001    | 0.11      | 1.734        | N         |
    | Q12   | 0.079    | N        | N         | 16.14        | N         |
    | Q13   | 0.641    | 0.001    | 0.016 (P) | 1.812        | 5.563 (P) |
    | Q15   | 0.031    | N        | N         | 1.688        | N         |
16. Two Complete Stores: BigOWLIM vs. AllegroGraph
17. Two Fast Stores: BigOWLIM vs. Sesame
18. Modification Tests: Insert
19. Modification Tests: Update & Delete
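The insert, update and delete tests on these two slides come down to timing modification operations against each store. A minimal sketch using Jena's in-memory dataset and SPARQL Update API as a stand-in; the real tests target each benchmarked repository through its own interface, and all URIs here are hypothetical:

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.query.DatasetFactory;
import org.apache.jena.update.UpdateAction;

public class ModificationTest {
    public static void main(String[] args) {
        Dataset ds = DatasetFactory.create(); // in-memory stand-in store
        String ns = "http://example.org/pa#";

        long t0 = System.nanoTime();
        UpdateAction.parseExecute(
            "INSERT DATA { <" + ns + "img1> <" + ns + "depicts> <" + ns + "WayneRooney> }", ds);
        long insertNs = System.nanoTime() - t0;

        t0 = System.nanoTime();
        UpdateAction.parseExecute(          // update = delete + re-insert
            "DELETE { ?s <" + ns + "depicts> ?o } " +
            "INSERT { ?s <" + ns + "depicts> <" + ns + "Rooney> } " +
            "WHERE  { ?s <" + ns + "depicts> ?o }", ds);
        long updateNs = System.nanoTime() - t0;

        t0 = System.nanoTime();
        UpdateAction.parseExecute("DELETE WHERE { ?s ?p ?o }", ds);
        long deleteNs = System.nanoTime() - t0;

        System.out.printf("insert %.3f ms, update %.3f ms, delete %.3f ms%n",
                insertNs / 1e6, updateNs / 1e6, deleteNs / 1e6);
    }
}
```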
20. Conclusions
   - PA Dataset benchmarking
     - Translated the essential and desirable requirements of our application into a set of functional (practical) and non-functional (analytical) parameters
     - To consolidate our findings we used UOBM, a public benchmark that satisfies the requirements of our target system
   - Analysis
     - All the repositories are sound...
     - ...however not complete
     - BigOWLIM provides the best average query response time and answers the most queries on both datasets, but is slower in the loading and modification tests
     - Sesame, Jena, Virtuoso and Oracle offered sub-second query response times for the majority of the queries they answered
     - Allegrograph answers more queries than the other four repositories and hence offers better coverage of OWL properties, but its average query response time was the highest on both datasets
   - Further work
     - Expanding this benchmarking exercise to a billion triples
     - More repositories
     - Extra benchmarking parameters, such as the performance impact of concurrent users and transaction-related operations
