Ch19
Published in: Technology, Design
Transcript

  • 1. Chapter 19 Web Crawler
  • 2. Chapter Objectives
    • Provide a case study example from problem statement through implementation
    • Demonstrate how hash tables and graphs can be used to solve a problem
  • 3. Web Crawler
    • A web crawler is a system that searches the web, beginning with a user-designated web page, looking for a designated target string
    • A web crawler follows all of the links on each page that it encounters until there are no more pages or until it reaches a designated limit
  • 4. Web Crawler
    • For this case study, we will create a graphical web crawler with the following requirements:
      • Enter a designated starting web page
      • Enter a target string for which to search
      • Limit the search to 50 pages
      • Display the results when done
  • 5. Web Crawler - Design
    • Our web crawler system consists of three high-level components:
      • The driver
      • The graphical user interface
      • The web crawler implementation
        • Makes use of graphs and hash tables
  • 6. Web Crawler - Design
    • The algorithm for the web crawler is as follows:
      • Add the starting page to a HashSet of pages to be searched and to our graph
      • Remove a page from the set of pages to be searched
      • Search the page for the target string
        • If string exists, add page to list of results
      • Search the page for links
        • If links have not already been searched, add them to set of pages to be searched and to our graph
      • Repeat the three previous steps until our limit is reached or the set is empty
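
The steps above can be sketched in Java roughly as follows. This is a minimal illustration under stated assumptions, not the chapter's actual implementation: the class name `CrawlerSketch`, the `Result` holder, the `fetch` function (which stands in for downloading a page), and the simple `href="..."` pattern are all hypothetical. The graph is modeled as an adjacency map whose keys double as the record of pages already encountered.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CrawlerSketch {
    // Assumption: links appear as href="..." attributes; real HTML needs a proper parser.
    private static final Pattern LINK = Pattern.compile("href=\"([^\"]+)\"");

    /** Hypothetical result holder: pages containing the target, plus the link graph. */
    public static class Result {
        public final List<String> matches = new ArrayList<>();
        public final Map<String, Set<String>> graph = new LinkedHashMap<>();
    }

    /**
     * Crawls from start, searching at most limit pages for target.
     * fetch maps a URL to its page text (null if the page is unavailable).
     */
    public static Result crawl(String start, String target, int limit,
                               Function<String, String> fetch) {
        Result result = new Result();
        Set<String> toSearch = new HashSet<>();

        // Add the starting page to the set of pages to be searched and to the graph.
        toSearch.add(start);
        result.graph.put(start, new LinkedHashSet<>());

        int searched = 0;
        while (!toSearch.isEmpty() && searched < limit) {
            // Remove a page from the set of pages to be searched.
            Iterator<String> it = toSearch.iterator();
            String page = it.next();
            it.remove();
            searched++;

            String content = fetch.apply(page);
            if (content == null) {
                continue; // page could not be loaded; it still counts toward the limit
            }

            // Search the page for the target string.
            if (content.contains(target)) {
                result.matches.add(page);
            }

            // Search the page for links; queue any page not yet in the graph.
            Matcher m = LINK.matcher(content);
            while (m.find()) {
                String link = m.group(1);
                result.graph.get(page).add(link); // record the edge page -> link
                if (!result.graph.containsKey(link)) {
                    result.graph.put(link, new LinkedHashSet<>());
                    toSearch.add(link);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A tiny in-memory "web" so the sketch runs without network access.
        Map<String, String> web = new HashMap<>();
        web.put("a", "intro <a href=\"b\">b</a> <a href=\"c\">c</a>");
        web.put("b", "has needle <a href=\"a\">a</a>");
        web.put("c", "nothing here");

        Result r = crawl("a", "needle", 50, web::get);
        System.out.println("Matches: " + r.matches);
        System.out.println("Graph:   " + r.graph);
    }
}
```

Passing the page loader in as a function keeps the algorithm separate from network code, so the 50-page limit from the requirements is just the `limit` argument. Because a `HashSet` has no defined iteration order, the order in which pages are visited is not guaranteed.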
  • 7. FIGURE 19.1 User interface design
  • 8. FIGURE 19.2 UML description
