Web crawling is the process of gathering pages from the Web in order to index them and support search engines. Crawlers, sometimes called spiders, begin with a set of seed URLs; they fetch and parse each page, extract the URLs it links to, and place those URLs on a queue of pages to crawl. An effective crawler provides robustness, politeness, distributed operation, scalability, high performance, fresh content, and extensibility.
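The core fetch-parse-enqueue loop described above can be sketched as a minimal breadth-first crawler. This is an illustrative sketch, not a production crawler: it omits politeness (robots.txt, rate limiting) and distribution, and the `fetch` function is injected so the example can run against an in-memory stand-in for the Web rather than real HTTP.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collect href targets from <a> tags, resolved against the page's URL."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))

def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl: pop a URL from the frontier, fetch and parse the
    page, and enqueue newly discovered links. `fetch(url)` returns HTML or
    None on failure."""
    frontier = deque(seed_urls)   # queue of URLs still to be crawled
    seen = set(seed_urls)         # guards against enqueueing duplicates
    pages = {}                    # url -> html: the fetched corpus to index
    while frontier and len(pages) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        extractor = LinkExtractor(url)
        extractor.feed(html)
        for link in extractor.links:
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return pages

# Offline demo: a tiny in-memory "web" (hypothetical URLs) stands in for
# real HTTP fetches; dict.get returns None for unknown URLs, like a 404.
site = {
    "http://a.example/": '<a href="/b">b</a><a href="http://c.example/">c</a>',
    "http://a.example/b": '<a href="/">home</a>',
    "http://c.example/": "no links here",
}
pages = crawl(["http://a.example/"], fetch=site.get)
print(sorted(pages))  # all three reachable pages are crawled
```

A real crawler would replace the injected `fetch` with an HTTP client, and the plain `deque` frontier with a prioritized, politeness-aware frontier that spaces out requests to any single host.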