Supporting End Users In The Creation Of Dependable Web Clips



  1. Supporting End Users in the Creation of Dependable Web Clips
     Sandeep Lingam, Sebastian Elbaum
     Proceedings of the 16th International Conference on World Wide Web (WWW2007)
     Reporter: Shih-Feng Yang, 2007/7/2
  2. Outline
     • Introduction
     • Web Clipper
     • Evaluation
     • Conclusion
  3. Introduction
     • Web authoring environments have enabled end users who are non-programmers to design and quickly construct web pages.
     • Web clip: a component within the end user's website that can dynamically extract information from other web sources.
  4. Introduction: Web Clip
  5. Introduction
     • Goal
       • Web Clipper: an approach to support end users through the entire process of creating a dependable web clip.
       • Three fundamental aspects:
         • The tool is embedded in the web authoring tool interface.
         • Training: increases the robustness of the web clip.
         • Multiple filters are deployed to increase confidence in the correctness of the retrieved information.
  6. Introduction
     • Challenges
       • End users cannot be expected to have any programming experience with web clips.
       • The content within the target site of a web clip will change.
  7. Web Clipper
     • Approach Overview
  8. Web Clipper - Clipping
     • Target Clip Selection
       • A custom browser is used to control the web clip.
       • Every extractable document element is highlighted as the user moves the mouse over it, and the user makes a selection by clicking on it.
     • Extraction Pattern
       • Once a selection is made, an extraction pattern is generated.
       • During the clipping process, the user's selection is uniquely identified by its HTML-Path.
       • HTML-Path: a specialized XPath expression.
  9. Web Clipper - Clipping
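The idea of identifying a selection by a path expression can be illustrated with a minimal Python sketch. It is not the authors' HTML-Path implementation: the helper `html_path`, the sample page, and the purely positional path format are assumptions made for illustration, and the standard library's limited XPath support stands in for the real tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page standing in for a clipped site.
PAGE = """<html><body>
<div id="header"><h1>Weather</h1></div>
<div id="main"><table><tr><td>Lincoln</td><td>72F</td></tr></table></div>
</body></html>"""

def html_path(root, target):
    """Build a positional XPath-like path from root to target (assumed format)."""
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while node is not root:
        parent = parent_of[node]
        same_tag = [c for c in parent if c.tag == node.tag]
        # XPath positions are 1-based among siblings with the same tag.
        steps.append(f"{node.tag}[{same_tag.index(node) + 1}]")
        node = parent
    return "./" + "/".join(reversed(steps))

root = ET.fromstring(PAGE)
selected = root.find(".//td[2]")   # pretend the user clicked this cell
path = html_path(root, selected)   # the stored extraction pattern
clip = root.find(path)             # later: re-extract using the pattern
print(path, "->", clip.text)
```

A purely positional path like this breaks as soon as a block is inserted or moved above the selection, which is the kind of fragility that motivates generating several extraction patterns during training.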
  10. Web Clipper - Training
      • To increase the robustness of the web clip, the authors construct extraction patterns that uniquely characterize the end user's selection.
      • Several clips are created using different extraction patterns.
      • Every time the user marks a clipping as valid, the system generates a filter corresponding to that clipping.
        • Filter: JavaScript code embedded within the user's web page.
  11. Web Clipper - Training
      Validation of the extraction patterns presented by the system.
  12. Web Clipper - Training
      • Extraction Patterns
  13. Web Clipper - Training
  14. Web Clipper - Deployment
      • The URL and extraction patterns of the clipped content are used to generate an AJAX script.
      • HTML documents are converted to XHTML.
      • Relative URLs are converted to absolute URLs.
      • Filters are generated from predefined templates for each of the extraction patterns from training.
      • The user can move, resize, or annotate the web clip to suit her preference.
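The relative-to-absolute URL step can be sketched in Python as follows. This is only an illustration under stated assumptions: the function `absolutize`, the attribute list, and the example URLs are hypothetical, the clipped fragment is assumed to be well-formed XHTML already, and the actual prototype performs this step in its generated script rather than in Python.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Attributes assumed to carry URLs; a real tool would likely cover more.
URL_ATTRS = ("href", "src")

def absolutize(fragment_xml, base_url):
    """Rewrite relative URLs in a well-formed XHTML fragment against base_url."""
    root = ET.fromstring(fragment_xml)
    for el in root.iter():
        for attr in URL_ATTRS:
            if attr in el.attrib:
                # urljoin leaves already-absolute URLs unchanged.
                el.set(attr, urljoin(base_url, el.get(attr)))
    return ET.tostring(root, encoding="unicode")

clip = '<div><a href="/news/today.html">News</a><img src="img/logo.png" /></div>'
result = absolutize(clip, "http://example.com/portal/index.html")
print(result)
```

Without this step, links and images inside the clip would resolve against the end user's own site instead of the clipped site, so they would break once the clip is embedded elsewhere.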
  15. Web Clipper - Filtering and Assessment
      The content that the user wants to see in the web clip
  16. Web Clipper - Filtering and Assessment
  17. Web Clipper - Filtering and Assessment
  18. Web Clipper - Filtering and Assessment
      • The paper then defines confidence:
        • A ratio based on the maximum filter score among all valid extraction patterns generated during training.
        • The prototype alerts the user when the content within the target site changes.
        • The user can also configure web clips to raise an alert when the confidence score falls below a particular threshold.
  19. Web Clipper - Filtering and Assessment
      The label filter has the highest score, so the system uses this pattern to extract the content, and the confidence score is 2/3 = 67%.
  20. Web Clipper - Filtering and Assessment
      The user is alerted when the content within the target site changes.
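The 2/3 = 67% figure from the filtering slides can be reproduced with a small Python sketch. The scoring model here is an assumption, not something the slides spell out: each filter is treated as a yes/no check on extracted content, a pattern's score is the number of filters its extraction passes, and confidence is the best score divided by the number of filters. All names (`score`, `best_pattern`, the three sample filters, the sample extractions) are hypothetical.

```python
def score(content, filters):
    """Number of filters that accept the content (assumed scoring rule)."""
    return sum(1 for f in filters if f(content))

def best_pattern(extractions, filters):
    """Pick the pattern whose extraction passes the most filters.

    Returns (pattern name, confidence), where confidence is assumed to be
    the best score divided by the total number of filters.
    """
    scores = {name: score(content, filters) for name, content in extractions.items()}
    best = max(scores, key=scores.get)
    return best, scores[best] / len(filters)

# Three hypothetical filters derived from what the user validated in training.
filters = [
    lambda s: "Temperature" in s,             # expected label is present
    lambda s: any(ch.isdigit() for ch in s),  # content contains a number
    lambda s: len(s) < 40,                    # content has a plausible length
]

# What each extraction pattern returns after the target page has changed.
extractions = {
    "html_path": "Advertisement: subscribe now for daily weather updates by e-mail",
    "label": "Temperature: mild, see forecast",
}

best, confidence = best_pattern(extractions, filters)
print(best, f"{confidence:.0%}")

ALERT_THRESHOLD = 0.5  # hypothetical user-configured threshold
if confidence < ALERT_THRESHOLD:
    print("alert: clipped content may no longer be correct")
```

Here the label-based pattern passes two of the three filters, so it is chosen with a confidence of 2/3, and no alert fires because that is above the sample threshold.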
  21. Evaluation
      • Effectiveness of the extraction patterns used in generating web clips.
      • Dependability of web clips in providing sufficiently correct information over time.
      • Robustness of web clips to changes in the clipped web site.
  22. Evaluation
      • Effectiveness of extraction patterns
  23. Evaluation
      • Dependability of web clips
      Confidence scores
  24. Evaluation
      • Robustness
        • This experiment tests how robust the web clips are to the following kinds of change in the clipped site:
          • Block Insertion
          • Block Movement
          • Block Deletion
          • Enclosing Element Changes
          • Target Clipping Removed
  25. Evaluation
      • Robustness
  26. Conclusion
      • This paper presented an approach to support end users through the entire process of creating a dependable web clip.
      • Web Clipper addresses the shortcomings of existing tools by introducing the notions of training and dynamic confidence evaluation.
  27. Finish
      • Thanks for your patience!