Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

PDQ Poster

400 views

Published on

PDQ: Proof-driven Query Answering over Web=based Data

Abstract: The data needed to answer queries is often available through Web-based APIs. Indeed, for a given query there may be many Web-based sources which can be used to answer it, with the sources overlapping in their vocabularies, and differing in their access restrictions (required arguments) and cost.
We introduce PDQ (Proof-Driven Query Answering), a system for determining a query plan in the presence of web-based sources. It is: (i) constraint-aware -- exploiting relationships between sources to rewrite an expensive query into a cheaper one, (ii) access-aware -- abiding by any access restrictions known in the sources, and (iii) cost-aware -- making use of any cost information that is available about services.
PDQ takes the novel approach of generating query plans from proofs that a query is answerable. We demonstrate the use of PDQ and its effectiveness in generating low-cost plans.

Published in: Technology
  • Be the first to comment

  • Be the first to like this

PDQ Poster

  1. 1. Finding Plans from Proofs PDQ: Proof-driven Query Answering over Web-based Data Michael Benedikt, Julien Leblay, Efthymia Tsamoura - Oxford University Supported by EPSRC grant EP/H017690/1, Query-driven Data Acquisition from Web-based Data Sources Project homepage: http://www.cs.ox.ac.uk/projects/pdq/ Contact: firstname.lastname@cs.ox.ac.uk Example: online services for geographic information r1: Places(id, name, type, coordinates, ...) information about places (e.g. city, country, continent, lake, etc.) r2: BelongsTo(source, target) containment between places, "China belongs to Asia". r3: Countries(id, name, iso_code, ...) information about countries. φ1:Places(x, y, Country, ...) ↔ Countries(x, y, ...) Query for countries in Asia: not answerable without considering constraints. SELECT p1.name FROM BelongsTo AS bt JOIN Places AS p1 ON p1.id=bt.source JOIN Places AS p2 ON p2.id=bt.target WHERE p1.type = ’Country’ AND p2.name = ’Asia’ Pre-processing steps create auxiliary schema by adding relations InferredAccPlaces, InferredAccBelongsTo, InferredAccCountries, Accessible and constraints: φ’1: InferredAccPlaces(x, y, Country, ...) ↔ InferredAccCountries(x, y,...) α1: Accessible(y)∧Places(x, y , z, ...) → InferredAccPlaces(x, y, z, ...)∧Accessible(x)∧Accessible(z)∧... α2: Accessible(x)∧BelongsTo(x, y) → InferredAccBelongsTo(x, y)∧Accessible(y) α3: Countries(x, y, z, ...) → InferredAccCountries(x, y, z, ...)∧Accessible(x)∧… α4: … Context Web data sources which may have: •overlapping information, •access restrictions. As a result: •There may be no web query plan for a given user query. •There may be many plans using different sources with different costs. Need to reason about Integrity constraints and access limitations. PDQ System for determining a query plan in the presence of web-based sources. i.constraint-aware ii.access-aware – abiding by access restrictions, iii.cost-aware – making use of any cost information Approach: generating query plans from proofs that a query is answerable. Input S: Schema 〈R, Σ〉, R set of relations with access methods (free, limited, inaccessible), Σ set of integrity constraints (TGDs). Q: Conjunctive query over S. f: Cost function on evaluation plans. Output Pbest: plan with minimal cost. Step 1: Pre-processing S augmented with new relations and axioms modelling the access restrictions. A goal query Qinferred is created based on the relations of the augmented schema. Q is grounded to form the initial state of the plan search. Step 2: Basic search step Each state is closed under firing of rules (blue arrows) other than accessibility axioms (denoted αi). Every possible firing of accessibility axioms (red arrows) gives a new candidate state, inheriting all the facts of its ancestors. Step 3: Plans and costs Each new state gives a plan, to which a cost is assigned (orange circles). If state corresponds to a match with Qinferred and its plan’s cost is lower than the best so far, it becomes the new best state. Queries over Web Data Architecture & User Experience User interface for creating and editing schemas and queries Interactive exploration of the planner’s search space. Online execution of plans. User interface for creating and configuring planning sessions. Dashboard Architecture Runtime Planner InferredAccPlaces(id2, "Asia", c2, …), Accessible(id2), Accessible(c2), … T’1 ⇐ Places ⇐ ("퐴푠푖푎") InferredAccPlaces(id2, "Asia", c2, …), Accessible(id2), Accessible(c2), … T2 ⇐ Places ⇐("퐴푠푖푎") T3 := T1 ⋈ T2 InferredAccBelongsTo(id1, id2) T4 ⇐ BelongsTo ⇐ π source (T3) T5 := π name ( T3 ⋈ T4 ) Places(id1, name1, "Country", …), Places(id2, "Asia", …), BelongsTo(id1, id2), Accessible("Asia"), Accessible("Country") Initial State Countries(id1, name1, c1, …) φ1 Goal : Qinferred(name) ← InferredAccPlaces(id1, name1, "Country", …) ∧ InferredAccPlaces(id2, "Asia", …)∧ InferredAccBelongsTo(id1, id2) φ‘1 α1 α1 α2 α3 InferredAccCountries(id1, name1, c1, …), Accessible(id1), Accessible(name1), Accessible(c1) T1 ⇐ Countries ⇐ Ø InferredAccPlaces(id1, name1, "Country", …) InferredAccCountries(id1, name1, c1, …), Accessible(id1), Accessible(name1), … T’2 ⇐ Countries ⇐ Ø T’3 := T’1 ⋈ T’2 InferredAccPlaces(id1, name1, "Country", …) φ‘1 α3 InferredAccBelongsTo(id1, id2) T’4 ⇐ BelongsTo ⇐ π source (T’3) T‘5 := π name ( T‘3 ⋈ T‘4 ) α2 3 2 25 35 45 55 Models free access on Countries

×