Supporting End Users In The Creation Of Dependable Web Clips
Transcript

  • 1. Supporting End Users in the Creation of Dependable Web Clips
    • Sandeep Lingam, Sebastian Elbaum
    • Proceedings of the 16th International Conference on World Wide Web (WWW 2007)
    • Reporter: Shih-Feng Yang, 2007/7/2
  • 2. Outline
    • Introduction
    • Web Clipper
    • Evaluation
    • Conclusion
  • 3. Introduction
    • Web authoring environments have enabled end-users who are non-programmers to design and quickly construct web pages.
    • Web clip: a component within the end-user's website that dynamically extracts information from other web sources.
  • 4. Introduction Web Clip
  • 5. Introduction
    • Goal
      • Web Clipper: an approach to support end-users through the entire process of creating a dependable web clip.
      • Three fundamental aspects:
        • Embedding: the tool is embedded in the web authoring tool's interface.
        • Training: increases the robustness of the web clip.
        • Filtering: multiple filters are deployed to increase confidence in the correctness of the retrieved information.
  • 6. Introduction
    • Challenges
      • End-users cannot be expected to have any programming experience with web clips.
      • The content within the target site of a web clip will change.
  • 7. Web Clipper
    • Approach Overview
  • 8. Web Clipper -Clipping
    • Target Clip Selection
      • There is a custom browser for controlling the web clip.
      • Every extractable document element is highlighted when the user moves the mouse, and the user can make a selection by clicking on it.
    • Extraction Pattern
      • Once a selection is made, an extraction pattern is generated.
      • During the clipping process, the user’s selection is uniquely identified by its HTML-Path.
      • HTML-Path: a specialized XPath expression.
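The slides do not show what an HTML-Path looks like, so the following is only a minimal sketch of the general idea: a positional, XPath-style path that identifies one selected element. The function name `html_path`, the sample page, and the exact path syntax are assumptions made for illustration, not the tool's actual format.

```python
import xml.etree.ElementTree as ET

def html_path(root, target):
    """Return a positional path such as /html/body[1]/div[2] for target."""
    # ElementTree elements have no parent link, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while node is not root:
        parent = parents[node]
        # Position among siblings with the same tag (1-based, as in XPath).
        same_tag = [c for c in parent if c.tag == node.tag]
        steps.append(f"{node.tag}[{same_tag.index(node) + 1}]")
        node = parent
    steps.append(root.tag)
    return "/" + "/".join(reversed(steps))

# Hypothetical page; the second <p> of the second <div> plays the user's selection.
PAGE = ("<html><body><div>news</div>"
        "<div><p>ad</p><p>Temperature: 21</p></div></body></html>")
root = ET.fromstring(PAGE)
target = root.find("./body/div[2]/p[2]")
path = html_path(root, target)
print(path)  # /html/body[1]/div[2]/p[2]
```

Because the path records only tag names and positions, it identifies the selection uniquely on the page as clipped, but it can point at different content once the page layout changes.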
  • 9. Web Clipper -Clipping
  • 10. Web Clipper -Training
    • To increase the robustness of the web clip, the system constructs extraction patterns that uniquely characterize the end-user's selection.
    • Several clips are created using different extraction patterns.
    • Every time the user marks a clipping as valid, the system generates a filter corresponding to the clipping.
      • Filter: JavaScript code, embedded within the user’s web page.
  • 11. Web Clipper -Training
    • Validation of the extraction patterns presented by the system.
  • 12. Web Clipper -Training
    • Extraction Patterns
  • 13. Web Clipper -Training
  • 14. Web Clipper -Deployment
    • The URL and extraction patterns of the clipped content are used to generate an AJAX script.
    • HTML documents are converted to XHTML.
    • Relative URLs are converted to absolute URLs.
    • Filters are generated from predefined templates for each of the extraction patterns from training.
    • The user can move, resize or annotate the web clip to suit her preference.
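The relative-to-absolute URL step above can be sketched as follows. This is a minimal illustration, not the tool's generated AJAX script: the function name `absolutize`, the regular expression, and the example URLs are assumptions, and the HTML-to-XHTML conversion is not covered.

```python
import re
from urllib.parse import urljoin

def absolutize(html, base_url):
    """Rewrite href/src attribute values so they become absolute URLs."""
    def fix(match):
        attr, quote, url = match.group(1), match.group(2), match.group(3)
        # urljoin resolves the URL against the page it was clipped from
        # and leaves already-absolute URLs unchanged.
        return f"{attr}={quote}{urljoin(base_url, url)}{quote}"
    return re.sub(r'\b(href|src)=(["\'])(.*?)\2', fix, html)

# Hypothetical clipped fragment and source page.
clip = '<a href="/story/1"><img src="img/a.png"></a>'
fixed = absolutize(clip, "http://example.com/news/index.html")
print(fixed)
```

Without this step, links and images inside the clip would resolve against the end-user's own site rather than the site the content was clipped from.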
  • 15. Web Clipper -Filtering and Assessment
    • The content that the user wants to see in the web clip.
  • 16. Web Clipper -Filtering and Assessment
  • 17. Web Clipper -Filtering and Assessment
  • 18. Web Clipper -Filtering and Assessment
    • The paper then defines confidence:
      • The ratio of the maximum filter score of all valid extraction patterns generated during the training session.
      • The prototype will alert the user when the content within the target site changes.
      • The user can also configure the web clips to provide alerts when the confidence scores fall below a particular threshold.
  • 19. Web Clipper -Filtering and Assessment
    • The label filter has the highest score, so the system uses this pattern to extract content; the confidence score is 2/3 ≈ 67%.
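The 2/3 example can be reproduced with a few lines of arithmetic. This sketch assumes that the denominator is the total number of filters (3 here), which the slides imply but do not state. The pattern names other than "label", the individual scores of the losing patterns, and the 0.7 threshold are hypothetical.

```python
def best_pattern(filter_scores, total_filters):
    """Pick the pattern with the highest filter score and its confidence."""
    name = max(filter_scores, key=filter_scores.get)
    return name, filter_scores[name] / total_filters

def needs_alert(confidence, threshold):
    """True when confidence falls below the user-configured threshold."""
    return confidence < threshold

# Filter scores per extraction pattern (illustrative values).
scores = {"label": 2, "html_path": 1, "position": 1}
pattern, conf = best_pattern(scores, total_filters=3)
print(pattern, f"{conf:.0%}")  # label 67%
print(needs_alert(conf, threshold=0.7))  # True
```

With a threshold of 0.7, a confidence of 67% would trigger the kind of alert the user can configure for low confidence scores.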
  • 20. Web Clipper -Filtering and Assessment
    • Alert the user when the content within the target site changes.
  • 21. Evaluation
    • Effectiveness of the extraction patterns used in generating web clips.
    • Dependability of web clips in providing sufficiently correct information over time.
    • Robustness of web clips to changes in the clipped web site.
  • 22. Evaluation
    • Effectiveness of extraction patterns
  • 23. Evaluation
    • Dependability of web clips
      • Confidence scores
  • 24. Evaluation
    • Robustness
      • This experiment tests how well web clips withstand different types of change in the clipped web site:
        • Block Insertion
        • Block Movement
        • Block Deletion
        • Enclosing Element Changes
        • Target Clipping Removed
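To see why a change such as block insertion matters, the sketch below compares two simplified extraction strategies on a page before and after a block is inserted. Both strategies and both page fragments are illustrative assumptions, not the extraction patterns evaluated in the paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical page before and after a "Banner" block is inserted.
ORIGINAL = "<body><div>Ads</div><div><b>Weather</b> Sunny</div></body>"
CHANGED = ("<body><div>Ads</div><div>Banner</div>"
           "<div><b>Weather</b> Sunny</div></body>")

def by_position(root):
    """Positional pattern: always take the second <div>."""
    return "".join(root.find("./div[2]").itertext())

def by_label(root, label):
    """Label pattern: take the <div> whose bold heading matches label."""
    for div in root.iter("div"):
        heading = div.find("b")
        if heading is not None and heading.text == label:
            return "".join(div.itertext())
    return None

before, after = ET.fromstring(ORIGINAL), ET.fromstring(CHANGED)
print(by_position(before), "|", by_position(after))
print(by_label(before, "Weather"), "|", by_label(after, "Weather"))
```

The positional pattern returns the inserted banner after the change, while the label pattern still finds the weather block. Keeping several patterns from training gives the clip a fallback when one of them breaks.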
  • 25. Evaluation
    • Robustness
  • 26. Conclusion
    • This paper presented an approach to support end-users through the entire process of creating a dependable web clip.
    • Web Clipper addresses the shortcomings of existing tools by introducing the notions of training and dynamic confidence evaluation.
  • 27. Finish
    • Thanks for your patience!