Approximate and Incremental Processing of Complex Queries against the Web of Data
Published in: Education
Transcript

  • 1. Approximate and Incremental Processing of Complex Queries against the Web of Data. Thanh Tran, Günter Ladwig, Andreas Wagner. DEXA 2011. Institute of Applied Informatics and Formal Description Methods (AIFB), KIT – University of the State of Baden-Württemberg and National Large-scale Research Center of the Helmholtz Association. www.kit.edu
  • 2. Contents: Introduction; Overview; Approximate & Incremental Processing (Entity Search, Approximate Structure Matching, Structure-based Result Refinement and Computation); Evaluation; Conclusion. August 31st, 2011, DEXA 2011, Toulouse, France. Institute of Applied Informatics and Formal Description Methods (AIFB)
  • 3. INTRODUCTION
  • 4. Introduction – Data Model: Resource Description Framework (RDF). [Figure: example RDF data graph with nodes p1–p5, a1, a2, c1, c2, i1, i2, u1, u2 and edge labels authorOf, conference, supervises, worksAt, knows, partOf, and name.]
  • 5. Introduction – Query Model: Basic Graph Patterns. Conjunctive queries over RDF data are answered by graph pattern matching. [Figure: example query graph over the variables u, v, w, x, y, z with edge labels name, partOf, worksAt, supervise, author, conf, and age, and the constants AIFB, KIT, ICDE, and 29.]
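
To make the query model concrete, here is a minimal sketch of basic graph pattern matching by backtracking. The toy triples, the `?`-prefix convention for variables, and the function name `match_bgp` are assumptions made for illustration; they are not taken from the slides.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical toy data graph (subject, predicate, object); values are strings.
DATA: List[Triple] = [
    ("p1", "worksAt", "i1"), ("p1", "age", "29"),
    ("i1", "name", "AIFB"), ("i1", "partOf", "u1"), ("u1", "name", "KIT"),
    ("p2", "worksAt", "i2"), ("p2", "age", "29"), ("i2", "name", "Other"),
]

# A conjunctive query: every triple pattern must match under one binding.
QUERY: List[Triple] = [
    ("?x", "worksAt", "?z"),
    ("?x", "age", "29"),
    ("?z", "name", "AIFB"),
]


def is_var(term: str) -> bool:
    return term.startswith("?")


def match_bgp(patterns: List[Triple], data: List[Triple],
              binding: Optional[Dict[str, str]] = None) -> Iterator[Dict[str, str]]:
    """Yield every variable binding under which all patterns occur in data."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in data:
        extended = dict(binding)
        ok = True
        for q_term, d_term in zip(first, triple):
            if is_var(q_term):
                # Bind the variable, or check consistency with an earlier binding.
                if extended.setdefault(q_term, d_term) != d_term:
                    ok = False
                    break
            elif q_term != d_term:
                ok = False
                break
        if ok:
            yield from match_bgp(rest, data, extended)


results = list(match_bgp(QUERY, DATA))
```

This exhaustive matching is the exact and complete baseline; the pipeline described in the following slides trades some of that exactness for earlier, approximate answers.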
  • 6. Contribution
      • Techniques for matching (basic) query patterns against graph-structured data have limits.
      • We might wish to trade completeness and exactness for responsiveness.
      • Our approach allows an “affordable” computation of an initial set of approximate results, which can be incrementally refined as needed.
  • 7. Contribution – Pipeline Overview
      • Pipeline of operations where approximate results are refined incrementally.
      • Stages, each passing intermediate, approximate results to the next: Entity Search → Approximate Structure Matching → Structure-based Result Refinement → Structure-based Answer Computation.
      • Indexes: Entity & Neighborhood Index, Structure Index, Relation Index.
  • 8. ENTITY SEARCH
  • 9. Entity Search
      • Entity index: stores attribute edges of the data graph; enables lookup of entities by attribute and value.
      • Entity search: obtains candidate bindings for all variables in the query that have attribute edges; does not consider structure (i.e., relations between entities).
      • Query decomposition and transformation: decompose the query into entity queries to create a transformed query.
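
The entity index described on this slide can be pictured as an inverted index from (attribute, value) pairs to entities. The sketch below is one possible in-memory layout, assuming hypothetical data and the helper names `build_entity_index` and `entity_search`; the slides do not specify the actual index implementation.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical attribute edges: (entity, attribute, value).
ATTRIBUTE_EDGES = [
    ("p1", "age", "29"), ("p3", "age", "29"), ("p2", "age", "35"),
    ("i1", "name", "AIFB"), ("u1", "name", "KIT"), ("c1", "name", "ICDE"),
]


def build_entity_index(edges) -> Dict[Tuple[str, str], Set[str]]:
    """Map each (attribute, value) pair to the entities that carry it."""
    index: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for entity, attribute, value in edges:
        index[(attribute, value)].add(entity)
    return index


def entity_search(index, entity_query: List[Tuple[str, str]]) -> Set[str]:
    """Return entities satisfying every (attribute, value) condition."""
    candidate_sets = [index.get(condition, set()) for condition in entity_query]
    return set.intersection(*candidate_sets) if candidate_sets else set()


index = build_entity_index(ATTRIBUTE_EDGES)
bindings_x = entity_search(index, [("age", "29")])     # candidates for a variable with age 29
bindings_z = entity_search(index, [("name", "AIFB")])  # candidates for a variable named AIFB
```

Because only attribute edges are consulted, the returned sets are candidates: relations between the bound entities are not checked at this stage.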
  • 10. Query Decomposition & Transformation: identify entity queries by breadth-first search starting from a random variable. [Figure: the example query graph with its entity queries marked.]
  • 11. Query Decomposition & Transformation: collapse entity queries. [Figure: the example query graph and the transformed query, in which the attribute edges of z (name AIFB), u (name KIT), x (age 29), and v (name ICDE) are collapsed into entity queries, leaving the relation edges partOf, worksAt, supervise, author, and conf.]
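
The decomposition step can be sketched as splitting the query's triple patterns into attribute edges (grouped per variable into entity queries) and relation edges (which form the transformed query). The sketch uses a single pass instead of the breadth-first traversal named on the slides, and the edge directions in `QUERY` are guesses read off the flattened figure, so treat both as assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Example query; edge directions are assumed, variables carry a "?" prefix.
QUERY: List[Triple] = [
    ("?z", "name", "AIFB"), ("?u", "name", "KIT"), ("?z", "partOf", "?u"),
    ("?x", "worksAt", "?z"), ("?w", "supervise", "?x"), ("?x", "age", "29"),
    ("?x", "author", "?y"), ("?y", "conf", "?v"), ("?v", "name", "ICDE"),
]


def decompose(query: List[Triple]):
    """Split a query into per-variable entity queries and relation edges."""
    entity_queries: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    relation_edges: List[Triple] = []
    for subject, predicate, obj in query:
        if obj.startswith("?"):
            # Variable-to-variable edge: kept as structure in the transformed query.
            relation_edges.append((subject, predicate, obj))
        else:
            # Variable-to-constant edge: collapsed into the variable's entity query.
            entity_queries[subject].append((predicate, obj))
    return dict(entity_queries), relation_edges


entity_queries, relation_edges = decompose(QUERY)
```

Each entry of `entity_queries` can be answered by an attribute/value lookup, while `relation_edges` is what the later structure-matching steps still have to check.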
  • 12. Entity Search Results
      • Use the entity index to obtain bindings for all entity queries in the transformed query.
      • Entity queries are necessary conditions, but not sufficient.
      • Final results will be a subset.
      • Example bindings:
            x   z   u   v
            p1  i1  u1  c1
            p3  i1  u1  c1
            p5  i1  u1  c1
            p6  i1  u1  c1
  • 13. APPROXIMATE STRUCTURE MATCHING
  • 14. Approximate Structure Matching
      • Only entity parts of the query have been matched; relation edges have yet to be processed.
      • Instead of performing exact equijoins, we propose to perform a neighborhood join.
      • The k-neighborhood of an entity e is the set of entities in the data graph that can be reached from e via a path of relation edges of length k or less.
      • A neighborhood join allows us to check whether two entities are connected via relation edges (but not which ones).
      • A neighborhood join between two sets of entities E1, E2 is an equijoin between all pairs e1 ∈ E1, e2 ∈ E2, where e1 and e2 are considered equivalent if the intersection of their k-neighborhoods is non-empty.
      • Again: necessary, but not sufficient.
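
The two definitions above can be sketched with explicit sets before any probabilistic structure is involved. The sketch assumes that relation edges are traversed in both directions and that an entity belongs to its own neighborhood (a path of length 0); the slides do not state either point, and the data and function names are hypothetical.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical relation edges: (subject, relation, object).
RELATIONS = [
    ("p1", "worksAt", "i1"), ("i1", "partOf", "u1"),
    ("p2", "worksAt", "i2"), ("i2", "partOf", "u2"),
]


def build_adjacency(relation_edges) -> Dict[str, Set[str]]:
    """Undirected adjacency over relation edges (direction ignored by assumption)."""
    adjacency: Dict[str, Set[str]] = defaultdict(set)
    for subject, _, obj in relation_edges:
        adjacency[subject].add(obj)
        adjacency[obj].add(subject)
    return adjacency


def k_neighborhood(adjacency, entity: str, k: int) -> Set[str]:
    """Entities reachable from `entity` via at most k relation edges (BFS)."""
    seen = {entity}
    frontier = deque([(entity, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == k:
            continue
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, dist + 1))
    return seen


def neighborhood_join(adjacency, e1s: Iterable[str], e2s: Iterable[str],
                      k: int) -> List[Tuple[str, str]]:
    """Keep pairs whose k-neighborhoods intersect."""
    e2s = list(e2s)
    hoods2 = {e2: k_neighborhood(adjacency, e2, k) for e2 in e2s}
    return [(e1, e2)
            for e1 in e1s
            for e2 in e2s
            if k_neighborhood(adjacency, e1, k) & hoods2[e2]]


adjacency = build_adjacency(RELATIONS)
pairs = neighborhood_join(adjacency, ["p1", "p2"], ["i1"], k=1)
```

The join tells us that p1 and i1 are connected within distance k, but not through which relation, which is why the result remains a necessary rather than a sufficient condition.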
  • 15. Neighborhood Join via Bloom Filters
      • We store the set of k-neighborhood entities as a bloom filter.
      • Bloom filter: a space-efficient, probabilistic data structure for set membership tests. False positives are possible (false negatives are not).
      • We refine the results of the previous step.
      • To perform a neighborhood join between bindings E1, E2: load the bloom filters for one set of entities, say E1, then check in a nested-loop manner whether entities in E2 are contained in the bloom filter.
  • 16. Neighborhood Join via Bloom Filters
      • [Figure: the example query graph with the k=1 and k=2 neighborhoods of x marked.]
      • Load bloom filters for entities bound to x.
      • Check whether entities bound to w, y, z are in the neighborhood of x.
      • When k=2, bloom filters for x also cover u and v.
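
A minimal sketch of the bloom-filter variant follows. The filter size, the number of hash functions, the use of SHA-256 for hashing, and the precomputed neighborhoods are all assumptions chosen for illustration; the slides give no such parameters.

```python
import hashlib
from typing import Dict, Iterable, Iterator, List, Set, Tuple


class BloomFilter:
    """Small bloom filter backed by a Python int used as a bit array."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item: str) -> Iterator[int]:
        # Derive several bit positions by salting one hash function with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


# Hypothetical precomputed k-neighborhoods of the entities bound to one variable.
NEIGHBORHOODS: Dict[str, Set[str]] = {
    "p1": {"i1", "u1"},
    "p2": {"i2", "u2"},
}

filters: Dict[str, BloomFilter] = {}
for entity, neighbors in NEIGHBORHOODS.items():
    bloom = BloomFilter()
    for neighbor in neighbors:
        bloom.add(neighbor)
    filters[entity] = bloom


def bloom_neighborhood_join(filters: Dict[str, BloomFilter], e1s: Iterable[str],
                            e2s: Iterable[str]) -> List[Tuple[str, str]]:
    """Nested-loop join: keep (e1, e2) if e2 tests positive in e1's filter."""
    e2s = list(e2s)
    return [(e1, e2) for e1 in e1s for e2 in e2s if e2 in filters[e1]]


pairs = bloom_neighborhood_join(filters, ["p1", "p2"], ["i1", "i2"])
```

Every truly connected pair is guaranteed to survive the join, while an occasional unconnected pair may slip through as a false positive; later refinement steps are expected to remove those.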
  • 17. STRUCTURE-BASED RESULT REFINEMENT
  • 18. Structure-based Result Refinement
      • From approximate structure matching (ASM) we know that entities in intermediate results are connected. This is necessary, but not sufficient.
      • With structure-based result refinement we find out whether they are connected via paths captured by query atoms.
      • The query is matched against a structure index graph: a bisimulation-based summary of the data graph that captures structural information. Nodes in the data graph with the same “structure” are grouped together, so the index graph is much smaller than the data graph.
  • 19. Structure Index: Bisimulation. [Figure: data graph G next to its structure index graph G~, whose nodes group structurally equivalent data nodes, such as {p1, p3}, {p2, p4}, {a1, a2}, {c1, c2}, {i1, i2}, and {u1, u2}, connected by the same edge labels as in G.]
  • 20. Structure-based Result Refinement
      • We take advantage of this property: whenever there is a match of a query graph q on G, the query also matches on G~. Moreover, extensions of the index graph matches will contain all data graph matches, i.e. the bindings to query variables.
      • Match the query against the structure index graph to obtain sets of extensions that contain potential query answers.
      • Bindings computed in the previous ES/ASM steps can only be answers if they are contained in the matched extensions.
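
As a rough illustration of how such a summary could be built, the sketch below computes a partition by naive refinement and derives the index graph edges from it. It considers outgoing edges only (forward bisimulation), which is a simplification; the slides do not say which bisimulation variant or algorithm is used, and the data and names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical relation edges of a small data graph G.
EDGES: List[Triple] = [
    ("p1", "worksAt", "i1"), ("p3", "worksAt", "i2"),
    ("i1", "partOf", "u1"), ("i2", "partOf", "u2"),
]
NODES = sorted({n for s, _, o in EDGES for n in (s, o)})


def bisimulation_blocks(nodes: List[str], edges: List[Triple]) -> Dict[str, int]:
    """Group nodes with the same outgoing structure by iterative refinement."""
    block = {n: 0 for n in nodes}  # start with all nodes in one block
    while True:
        # A node's signature: its current block plus the (label, target block)
        # pairs of its outgoing edges.
        signature = {
            n: (block[n], frozenset((p, block[o]) for s, p, o in edges if s == n))
            for n in nodes
        }
        ids: Dict[tuple, int] = {}
        new_block = {n: ids.setdefault(signature[n], len(ids)) for n in nodes}
        # Blocks only ever split, so an unchanged block count means stability.
        if len(set(new_block.values())) == len(set(block.values())):
            return new_block
        block = new_block


blocks = bisimulation_blocks(NODES, EDGES)

# Index graph G~: one edge per distinct (source block, label, target block).
index_edges: Set[Tuple[int, str, int]] = {(blocks[s], p, blocks[o]) for s, p, o in EDGES}
```

Every data edge maps onto an index edge by construction, which is the property the slide relies on: a query that matches on G must also match on G~, so bindings outside the matched blocks can be discarded.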
  • 21. STRUCTURE-BASED ANSWER COMPUTATION
  • 22. Structure-based Answer Computation
      • Finally, the results that exactly match the query are computed by the last refinement.
      • Only in this step do we actually perform joins on the data.
  • 23. EVALUATION
  • 24. Evaluation
      • Systems: INC, the proposed approach; VP, join processing using vertical partitioning with sextuple indexing.
      • Datasets: DBLP, 13M triples; LUBM, 0.7M – 6.7M triples.
      • Queries: 80 queries generated via random sampling, with different shapes (path, star, graph).
  • 25. Results – Average Processing Time
  • 26. Results – Average Processing Time: Neighborhood Distance
  • 27. Results – Precision vs. Time
  • 28. Results – Precision
  • 29. Conclusion
      • We proposed a novel process for approximate and incremental processing of complex graph pattern queries.
      • Initial results are computed in a small fraction of the total time and then incrementally refined via approximate matching at low cost.
      • Responsiveness increases because inexact results are available early.
      • Users can decide whether, and for which results, exactness and completeness are desirable.
      • Experiments show that our approach is relatively fast w.r.t. exact and complete results, indicating that the proposed mechanism is able to reuse intermediate results.
  • 31. BACKUP SLIDES
