Supporting End Users In The Creation Of Dependable Web Clips

Transcript

  • 1. Supporting End Users in the Creation of Dependable Web Clips Sandeep Lingam, Sebastian Elbaum Proceedings of the 16th international conference on World Wide Web (WWW2007) Reporter: Shih-Feng Yang 2007/7/2
  • 2. Outline
    • Introduction
    • Web Clipper
    • Evaluation
    • Conclusion
  • 3. Introduction
    • Web authoring environments have enabled end-users who are non-programmers to design and quickly construct web pages.
    • Web clip: a component within the end-user’s website that dynamically extracts information from other web sources.
  • 4. Introduction Web Clip
  • 5. Introduction
    • Goal
      • Web Clipper: an approach to support end-users through the entire process of creating a dependable web clip.
      • Three fundamental aspects:
        • The tool is embedded in the web authoring tool interface.
        • Training: increases the robustness of the web clip.
        • Multiple filters are deployed to increase confidence in the correctness of the retrieved information.
  • 6. Introduction
    • Challenges
      • We cannot expect end-users to have any programming experience.
      • The content within the target site of a web clip will change.
  • 7. Web Clipper
    • Approach Overview
  • 8. Web Clipper -Clipping
    • Target Clip Selection
      • There is a custom browser for controlling the web clip.
      • Every extractable document element is highlighted when the user moves the mouse, and the user can make a selection by clicking on it.
    • Extraction Pattern
      • Once a selection is made, an extraction pattern is generated.
      • During the clipping process, the user’s selection is uniquely identified by its HTML-Path.
      • HTML-Path: a specialized XPath expression.
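To make the idea of an HTML-Path concrete, here is a minimal sketch of extracting a selection with an XPath-like expression. The sample page, the path, and the `extract_clip` helper are hypothetical illustrations, not the authors' implementation, and the page is assumed to be well-formed XHTML so that Python's standard XML parser can read it.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML page; a real clip would fetch the target site over HTTP.
PAGE = """<html>
  <body>
    <div id="header">Site banner</div>
    <div id="content">
      <table>
        <tr><td>City</td><td>Forecast</td></tr>
        <tr><td>Lincoln</td><td>Sunny, 75F</td></tr>
      </table>
    </div>
  </body>
</html>"""

# Assumed shape of an HTML-Path: element names with 1-based positions,
# written in the XPath subset that ElementTree supports.
HTML_PATH = "./body/div[2]/table/tr[2]/td[2]"


def extract_clip(xhtml, path):
    """Return the text of the element at `path`, or None if the path no longer matches."""
    root = ET.fromstring(xhtml)
    node = root.find(path)
    if node is None:
        return None
    return "".join(node.itertext()).strip()


clip = extract_clip(PAGE, HTML_PATH)
print(clip)
```

A purely positional path like this one stops matching as soon as a block is inserted or moved on the target page, which is the kind of fragility the training step is meant to reduce.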
  • 9. Web Clipper -Clipping
  • 10. Web Clipper -Training
    • To increase the robustness of the web clip, they construct extraction patterns which uniquely characterize the end-user selection.
    • Several clips will be created using different extraction patterns.
    • Every time the user marks a clipping as valid, the system generates a filter corresponding to the clipping.
      • Filter: JavaScript code embedded within the user’s web page.
  • 11. Web Clipper -Training Validation of the extraction patterns presented by the system.
  • 12. Web Clipper -Training
    • Extraction Patterns
  • 13. Web Clipper -Training
  • 14. Web Clipper -Deployment
    • The URL and extraction patterns of the clipped content are used to generate an AJAX script.
    • HTML documents are converted to XHTML.
    • Relative URLs are converted to absolute URLs.
    • Generate filters from pre-defined templates for each of the extraction patterns during training.
    • The user can move, resize or annotate the web clip to suit her preference.
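The relative-to-absolute URL step above can be sketched as follows. This is only an illustration using Python's standard library; the base URL, the sample links, and the `absolutize` helper are hypothetical, and the deployed web clip itself is described as an AJAX script rather than Python.

```python
from urllib.parse import urljoin


def absolutize(base_url, links):
    """Resolve each link against the URL of the page it was clipped from."""
    return [urljoin(base_url, link) for link in links]


# Hypothetical target page and the links found inside the clipped content.
BASE = "http://example.com/weather/today.html"
links = [
    "images/sun.png",                   # relative to the page's directory
    "/css/site.css",                    # relative to the site root
    "../news/index.html",               # parent directory
    "http://other.example.org/a.html",  # already absolute, left unchanged
]

resolved = absolutize(BASE, links)
for url in resolved:
    print(url)
```

Without this step, images and links inside the clipped content would be resolved against the end-user's own site and break once the clip is embedded there.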
  • 15. Web Clipper -Filtering and Assessment The content which the user wants to see in the web clip
  • 16. Web Clipper -Filtering and Assessment
  • 17. Web Clipper -Filtering and Assessment
  • 18. Web Clipper -Filtering and Assessment
    • The paper then defines Confidence
      • The ratio of the maximum filter score to the number of valid extraction patterns generated during the training session.
      • The prototype will alert the user when the content within the target site changes.
      • The user can also configure the web clips to provide alerts when the confidence scores fall below a particular threshold.
  • 19. Web Clipper -Filtering and Assessment The label filter has the highest score, so the system uses its pattern to extract the content; the confidence score is 2/3 = 67%
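The arithmetic in this example can be sketched as below, reading confidence as the highest filter score divided by the number of valid extraction patterns from training. The filter names other than the label filter, the individual scores, and the alert threshold are hypothetical; the slides do not say how a filter score is computed, so the scores are taken as given inputs.

```python
def confidence(filter_scores, valid_patterns):
    """Return (best filter name, confidence) for one refresh of the web clip.

    filter_scores: mapping of filter name to its score (assumed to be given).
    valid_patterns: number of extraction patterns marked valid during training.
    """
    if valid_patterns <= 0:
        raise ValueError("at least one valid extraction pattern is required")
    best = max(filter_scores, key=filter_scores.get)
    return best, filter_scores[best] / valid_patterns


# Hypothetical scores chosen to reproduce the 2/3 example from the slide.
scores = {"label": 2, "html_path": 1, "attribute": 1}
best, conf = confidence(scores, valid_patterns=3)

# Hypothetical user-configured threshold for alerts.
ALERT_THRESHOLD = 0.5
needs_alert = conf < ALERT_THRESHOLD

print(f"use the {best} filter, confidence {conf:.0%}, alert: {needs_alert}")
```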
  • 20. Web Clipper -Filtering and Assessment Alert the user when the content within the target site changes
  • 21. Evaluation
    • Effectiveness of the extraction patterns used in generating web clips.
    • Dependability of web clips in providing sufficiently correct information over time.
    • Robustness of web clips to changes in the clipped web site.
  • 22. Evaluation
    • Effectiveness of extraction patterns
  • 23. Evaluation
    • Dependability of web clips: confidence scores
  • 24. Evaluation
    • Robustness
      • This experiment tests how well the web clips withstand changes in the clipped web site:
        • Block Insertion
        • Block Movement
        • Block Deletion
        • Enclosing Element Changes
        • Target Clipping Removed
  • 25. Evaluation
    • Robustness
  • 26. Conclusion
    • This paper presented an approach to support end-users through the entire process of creating a dependable web clip.
    • Web Clipper addresses the shortcomings of existing tools by introducing the notion of training and of dynamic confidence evaluation.
  • 27. Finish
    • Thanks for your patience!