Querying Capability Modeling

1. CS-690, Fall 2009. Querying Capability Modeling and Construction of Deep Web Sources. Paper by: Liangcai Shu, Weiyi Meng, Hai He, and Clement Yu. Presentation by: Fabian Alenius

3. The Deep Web is estimated to hold 400-550 times as much information as the surface web

4. Crawlers need to extract information

7. A form submission consists of a set of values, one for every field of the form

9. An attribute is a field in a form

10. A query is a set of attributes

11. A query instance is a set of attribute-value pairs, e.g. {<Author, Tolkien>, <Title, Bilbo>}
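The definitions on slides 9-11 can be sketched in Python. The representation below is an illustrative assumption, not taken from the paper: a query is a frozenset of attribute names, and a query instance is a dict mapping each attribute to its value.

```python
# Hypothetical representation; the names here are illustrative only.

def attributes_of(instance: dict) -> frozenset:
    """Return the query (set of attributes) that a query instance fills in."""
    return frozenset(instance)

# The query instance from slide 11.
instance = {"Author": "Tolkien", "Title": "Bilbo"}

# The corresponding query is just the set of attributes used.
query = attributes_of(instance)
```

Two query instances with different values but the same attributes map to the same query, which is why validity can be studied at the level of attribute sets.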

12. Four types of attributes: Functional Attribute, Range Attribute, Categorical Attribute, Value-Infinite Attribute

14. Conditional dependency – the value of C is insignificant if A is null

15. A -> C

17. T is not an attribute type but a value type (e.g. phone number, zip code, etc.)

18. Valid queries – accepted; Invalid queries – rejected

20. Formally: given a valid query $S = \{A_{u_1}, \ldots, A_{u_n}\}$, $S$ is atomic iff every query $T \subset S$ is invalid
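A minimal sketch of the atomicity test from slide 20, assuming some oracle `is_valid` that reports whether a source accepts a query. Both `is_valid` and the toy rule behind it are hypothetical stand-ins; in practice the answer would come from submitting the form and inspecting the response.

```python
from itertools import combinations

def proper_subsets(query):
    """Yield every proper subset of the query, including the empty set."""
    items = sorted(query)
    for size in range(len(items)):
        for combo in combinations(items, size):
            yield frozenset(combo)

def is_atomic(query, is_valid):
    """A valid query is atomic iff every proper subset of it is invalid."""
    query = frozenset(query)
    return is_valid(query) and not any(is_valid(t) for t in proper_subsets(query))

# Toy oracle (an assumption): the source accepts any query that contains
# Author or Title.
def toy_is_valid(query):
    return "Author" in query or "Title" in query

print(is_atomic({"Author"}, toy_is_valid))           # valid, no valid proper subset
print(is_atomic({"Author", "Title"}, toy_is_valid))  # valid, but {Author} is too
```

Checking all proper subsets is exponential in the number of attributes, so this is only practical for the small attribute counts typical of search forms.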

21. Goal is to find Atomic Queries

22. Assign value only if attribute is part of query

24. {<Author, Tolkien>}

25. {<Title, Bilbo>}

26. {<ISBN, 9780140285000 >}

27. {<Publisher, Penguin>}
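Slide 7 says a form submission carries a value for every field, while slide 22 says a value is assigned only to attributes that are part of the query. One way to reconcile the two, sketched under the assumption that unused fields are sent with an empty default (the function name and the default are hypothetical):

```python
def build_submission(form_fields, instance, default=""):
    """Expand a query instance into a full form submission.

    Fields named in the instance get its values; every other field
    gets the default, so the submission covers the whole form.
    """
    unknown = set(instance) - set(form_fields)
    if unknown:
        raise ValueError(f"attributes not in form: {sorted(unknown)}")
    return {field: instance.get(field, default) for field in form_fields}

fields = ["Author", "Title", "ISBN", "Publisher"]
submission = build_submission(fields, {"Author": "Tolkien"})
```

Here `submission` contains all four fields, with only `Author` filled in. Real forms may need per-field defaults (for example a preselected option in a drop-down) rather than one shared empty string.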

29. <Query> ::= <AtomicQuery> [<OptionalQuery>]

30. …

31. <CheckboxGroup> ::= <Checkbox> | ADJ(<Checkbox>, <CheckboxGroup>)

32. Used to determine if Query is valid

33. Interpretation tree example

34. System overview

36. Exploit proper subset property

37. Start with empty set and add attributes, stop at acceptance
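Slides 36 and 37 can be read as a level-wise search: probe small attribute sets first, and never probe a set that properly contains an atomic query already found, since such a set cannot be atomic. The sketch below is one possible reading under that assumption, not necessarily the paper's algorithm. `is_valid` is a hypothetical oracle, and the toy form with `MinPrice` and `MaxPrice` is invented for illustration.

```python
from itertools import combinations

def find_atomic_queries(attributes, is_valid):
    """Return all atomic queries, probing candidates in order of size."""
    atomic = []
    attrs = sorted(attributes)
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            candidate = frozenset(combo)
            # Proper subset property: a superset of an atomic query
            # has a valid proper subset, so it cannot be atomic.
            if any(a < candidate for a in atomic):
                continue
            if is_valid(candidate):
                atomic.append(candidate)  # stop growing at acceptance
    return atomic

# Toy oracle (an assumption): accepted if Author is given, or if both
# price bounds are given together.
probes = []
def toy_is_valid(query):
    probes.append(query)
    return "Author" in query or {"MinPrice", "MaxPrice"} <= query

found = find_atomic_queries(["Author", "Title", "MinPrice", "MaxPrice"], toy_is_valid)
```

Because candidates are visited by increasing size, any valid proper subset of a candidate would already have produced an atomic query that prunes it, so every accepted candidate is atomic. The `probes` list shows how many submissions the pruning saves compared with trying every subset.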

40. Large result page size

42. Similarity to original search interface

43. Small result page size

45. TEL-I dataset

46. Total of 64 sources

47. Result

49. Bigger dataset
