EDF2012 Peter Boncz - LOD benchmarking SRbench


  1. Benchmarking Linked Open Data technology
     SRbench: A Benchmark for Streaming RDF Storage Engines
     Ying Zhang, Peter Boncz (CWI, Amsterdam)
  2. What is Database Benchmarking?
     Standard test to measure and understand how technology performs
     • Dataset definition
        • at various scales (100GB, 300GB, 1TB, 3TB, etc.)
        • mimics a recognizable, relevant usage scenario
     • Database Queries
        • often between 10-100 queries, with parameters
        • plus rules/programs that specify how these queries are posed
     • Result Metrics
        • a number to understand the result
        • tps = “transactions/second”; $/QphH@size = “price per query per hour”
     • Audit Rules
        • allow results to be checked by independent auditors
        • prevent/limit cheating
     Ying Zhang, Peter Boncz – Benchmarking Linked Open Data Technology, June 7, 2012 @EDF Copenhagen
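
The two result metrics named on this slide can be illustrated with a little arithmetic. The sketch below is a simplified reading of "transactions/second" and "price per query per hour", not the official definition of any benchmark; real benchmark specifications define their metrics in more detail. The class `BenchmarkRun`, its fields, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkRun:
    """Outcome of one hypothetical benchmark run (all fields are assumptions)."""
    completed_transactions: int  # transactions finished in the measured interval
    completed_queries: int       # queries finished in the measured interval
    elapsed_seconds: float       # wall-clock length of the measured interval
    system_price_usd: float      # total price of the tested system


def transactions_per_second(run: BenchmarkRun) -> float:
    # Throughput metric: how many transactions complete per second.
    return run.completed_transactions / run.elapsed_seconds


def queries_per_hour(run: BenchmarkRun) -> float:
    # Scale the observed query count to a one-hour rate.
    return run.completed_queries * 3600.0 / run.elapsed_seconds


def price_per_query_per_hour(run: BenchmarkRun) -> float:
    # Price/performance metric: dollars paid per unit of hourly query throughput.
    return run.system_price_usd / queries_per_hour(run)


run = BenchmarkRun(
    completed_transactions=12_000,
    completed_queries=500,
    elapsed_seconds=600.0,
    system_price_usd=50_000.0,
)
print(transactions_per_second(run))
print(queries_per_hour(run))
print(price_per_query_per_hour(run))
```

A single number per run is what makes competing systems comparable; the audit rules then constrain how that number may be obtained.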
  3. Why Benchmarking?
     • make competing products comparable
     • accelerate progress, make technology viable
     © Jim Gray, 2005
  4. Benchmarking LOD Technology
     LOD = Linked Open Data
     • web addressable data
     • RDF data format
     • lots of useful data on the web (“LOD cloud”)
     LOD technology (SPARQL) benchmarks:
     • BSBM, DBpedia Benchmark, SIB
     • SRbench: topic of this talk
     New industry cooperation: [image not preserved]
  5. LDBC: FP7 2012-2015 *
     vendor cooperation to establish accepted RDF/Graph database benchmarks and benchmark results
     * tentative/expected project
  6. LDBC Goals
     1. Create the LDBC Foundation of graph and RDF DB vendors
     2. Equip the LDBC Foundation with a good initial set of benchmarks, and benchmark results
     (diagram label: spin-off)
  7. Benchmarking Linked Open Data technology
     SRbench: A Benchmark for Streaming RDF Storage Engines
     Ying Zhang, Peter Boncz (CWI, Amsterdam)
  8. SRbench: Streaming RDF Benchmark
     Traditional Database System vs. Stream Database System
     • Traditional Database System: a stream of queries runs against persistent data; “pull” based query answering
     • Stream Database System: a stream of data runs against persistent queries (“continuous queries”); “push” based query answering
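
The contrast on this slide between "pull" and "push" based query answering can be sketched in a few lines. This is a toy illustration under stated assumptions, not how any particular stream engine is built: `PullStore`, `PushEngine`, and the temperature readings are all hypothetical names and data.

```python
from typing import Callable, Dict, List, Tuple

Reading = Dict[str, float]
Predicate = Callable[[Reading], bool]


class PullStore:
    """Traditional model: data is persistent, each query is evaluated on demand."""

    def __init__(self) -> None:
        self.rows: List[Reading] = []

    def insert(self, row: Reading) -> None:
        self.rows.append(row)

    def query(self, predicate: Predicate) -> List[Reading]:
        # The client pulls an answer by scanning the stored data.
        return [row for row in self.rows if predicate(row)]


class PushEngine:
    """Stream model: queries are persistent, arriving data is pushed through them."""

    def __init__(self) -> None:
        self.continuous_queries: List[Tuple[Predicate, Callable[[Reading], None]]] = []

    def register(self, predicate: Predicate, on_match: Callable[[Reading], None]) -> None:
        # A continuous query stays registered and is re-evaluated on every arrival.
        self.continuous_queries.append((predicate, on_match))

    def on_arrival(self, row: Reading) -> None:
        for predicate, on_match in self.continuous_queries:
            if predicate(row):
                on_match(row)


def is_hot(row: Reading) -> bool:
    return row["temp"] > 30.0


readings = [{"temp": 18.0}, {"temp": 31.5}, {"temp": 29.0}]

store = PullStore()
for r in readings:
    store.insert(r)
pull_result = store.query(is_hot)      # answer computed when the client asks

alerts: List[Reading] = []
engine = PushEngine()
engine.register(is_hot, alerts.append)
for r in readings:
    engine.on_arrival(r)               # answer delivered as the data arrives

print(pull_result)
print(alerts)
```

Both paths yield the same matching readings here; the difference is who drives evaluation: the query in the pull model, the data in the push model.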
  9. Data Streams (1/4): Stock Market
  10. Data Streams (2/4): Social Chatter
      • Detect breaking news
      • Analyze marketing campaigns
  11. Data Streams (3/4): Car Traffic
      • monitor positions and speeds of cars
      • detect accidents, traffic jams
      • Applications: better safety, improved logistics
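
Detecting a traffic jam from a stream of car speeds is typically phrased as a continuous query over a time window. The sketch below assumes a time-based sliding window and a fixed speed threshold; the class `SlidingWindowAverage`, the 60-second window, the 20 km/h threshold, and the readings are all hypothetical and chosen only for illustration.

```python
from collections import deque
from typing import Deque, List, Tuple


class SlidingWindowAverage:
    """Average of the speed readings that fall in the last `window_seconds`."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self.items: Deque[Tuple[float, float]] = deque()  # (timestamp, speed_kmh)

    def add(self, timestamp: float, speed_kmh: float) -> float:
        self.items.append((timestamp, speed_kmh))
        # Evict readings that have slid out of the window (timestamps must be ascending).
        while self.items[0][0] <= timestamp - self.window_seconds:
            self.items.popleft()
        return sum(speed for _, speed in self.items) / len(self.items)


JAM_THRESHOLD_KMH = 20.0  # assumed threshold, not taken from the talk

window = SlidingWindowAverage(window_seconds=60.0)
stream = [(0.0, 90.0), (30.0, 80.0), (60.0, 10.0), (90.0, 12.0), (120.0, 8.0)]

jam_times: List[float] = []
for timestamp, speed in stream:
    average = window.add(timestamp, speed)
    if average < JAM_THRESHOLD_KMH:
        jam_times.append(timestamp)

print(jam_times)
```

The newest reading always stays in the window, so the average is defined after every arrival.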
  12. Data Streams (4/4): Tele Health
      Monitor health of elderly in their homes
      • Who are the users?
      • Why?
         • Difficult to reach locations
         • Make health care more affordable
      • How?
  13. SRbench: Streaming RDF Benchmark
      Streaming RDF data benefits:
      • apply Linked Open Data (LOD) principles to streaming data
         • Link streaming data to data on the web (enrichment)
         • Publish data streams on the web
      • support (simple) reasoning semantics in stream queries
      • Richer semantics than relational streaming database systems
  14. SRbench: Streaming RDF Benchmark
      Streaming RDF data challenges:
      • Proper benchmark dataset
         • use real-world datasets from LOD
      • No standard query language
         • natural language query definition + three implementations (SPARQLStream, CQELS, C-SPARQL)
      • Limited systems support
         • evaluate on the strRS system (UPM)
  15. SRbench: used Datasets
      Use case: weather information application
      [Diagram; layout not preserved. Recoverable labels:]
      • Datasets: LinkedSensorData, LinkedObservationData, LinkedSensorMetaData, DBpedia, GeoNames
      • Classes: Observation, System, ResultData, MeasureData, TruthData, Instant, Point, LocatedNearRel, Airport, Feature
      • Properties: om-owl:procedure, om-owl:result, om-owl:samplingTime, om-owl:hasLocatedNearRel, om-owl:processLocation, om-owl:hasLocation, owl:sameAs
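
To make the linking between sensor data, GeoNames and DBpedia concrete, here is a small in-memory sketch that follows property names from this slide across datasets. Which class each property connects, and in which direction, is an assumption, since the diagram did not survive extraction. Every `ex:` identifier and the helper functions `match`, `objects` and `airports_near_observation` are hypothetical.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical triples; the edge directions are assumed, not taken from the slide.
TRIPLES: List[Triple] = [
    ("ex:obs1", "om-owl:procedure", "ex:sensorA"),              # Observation -> System
    ("ex:obs1", "om-owl:result", "ex:result1"),                 # Observation -> ResultData
    ("ex:obs1", "om-owl:samplingTime", "ex:instant1"),          # Observation -> Instant
    ("ex:sensorA", "om-owl:processLocation", "ex:pointA"),      # System -> Point
    ("ex:sensorA", "om-owl:hasLocatedNearRel", "ex:nearRelA"),  # System -> LocatedNearRel
    ("ex:nearRelA", "om-owl:hasLocation", "ex:feature1"),       # LocatedNearRel -> Feature
    ("ex:feature1", "owl:sameAs", "ex:airport1"),               # Feature -> Airport
]


def match(s: Optional[str], p: Optional[str], o: Optional[str]) -> List[Triple]:
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def objects(subject: str, predicate: str) -> List[str]:
    return [o for _, _, o in match(subject, predicate, None)]


def airports_near_observation(observation: str) -> List[str]:
    """Follow the links observation -> sensor -> nearby feature -> airport."""
    found: List[str] = []
    for sensor in objects(observation, "om-owl:procedure"):
        for rel in objects(sensor, "om-owl:hasLocatedNearRel"):
            for feature in objects(rel, "om-owl:hasLocation"):
                found.extend(objects(feature, "owl:sameAs"))
    return found


print(airports_near_observation("ex:obs1"))
```

The chained lookups stand in for the joins a SPARQL-style query would perform when it enriches streaming observations with static data from the web.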
  16. SRBench Queries
  17. Summary
      • the importance of
         • Database System Benchmarking
         • RDF Database System Benchmarking
         • Streaming RDF Database System Benchmarking
      • SRbench
         • Developed in PlanetData (CWI, UPM)
         • First dedicated streaming RDF/SPARQL benchmark
      • SRbench future work:
         • performance evaluation
         • results verification (not easy!)
  18. Thank You! Questions?
      Ying Zhang (zhang@cwi.nl)
      Peter Boncz (boncz@cwi.nl)
