"Search Engine" Team Members: Prashant Mathur, Neha Gupta, Monu K. Verma, Mohd. Shoaib
SPIDERS OPTIMIZERS Search engines crawlers
What is a SEARCH ENGINE? Search (computing): to examine a computer file, disk, database, or network for particular information. Engine: something that supplies the driving force or energy to a movement, system, or trend. Search engine: a computer program that searches for particular keywords and returns a list of documents in which they were found, especially a commercial service that scans documents on the Internet.
Different search engines
What can a search engine do? Search engines do not search only for keywords; some search for other things as well. They are not really "engines" in the classical sense, but then a computer mouse is not a "mouse" either.
PageRank A page is important when it is referred to a lot, or referred to from an important page. PageRank (PR) is used to prioritize; it works well even when the search is just on page titles.
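The idea that a page is important when many pages, or important pages, refer to it can be sketched as a simple iterative computation. The following is a minimal illustration, not Google's actual implementation: the function name `pagerank`, the damping factor of 0.85, the way pages without outgoing links are handled, and the toy link graph are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively estimate importance for a dict {page: [pages it links to]}."""
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally important.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Each page keeps a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its importance on to the pages it refers to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: every page refers to "home".
toy_web = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home"],
    "blog": ["home", "news"],
}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph "home" ends up with the highest score because every other page refers to it, while "blog", which nothing refers to, ends up with the lowest.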
Different search engines & PORTALS
3 basic tasks Internet search engines are special sites on the Web that are designed to help people find information stored on other sites. There are differences in the ways various search engines work, but they all perform three basic tasks: They search the Internet or select pieces from the Internet based on important keywords. They keep an index of the words they find and where they find them. They allow users to look for words or combinations of words found in that index.
WORLD WIDE WEB
Key terms related to search engine working. A search engine operates in the following order: Crawling: follow links to find information. Indexing: record what words appear where. Ranking: what information is a good match to a user query? What information is inherently good?
(contd.) Displaying: find a good format for the information. Serving: handle queries, find pages, display results.
(contd.) Spiders: To find information on the hundreds of millions of Web pages that exist, a search engine employs special software robots, called spiders, to build lists of the words found on Web sites. Spiders take a Web page's content and create key search words that enable online users to find the pages they are looking for. Crawling: When a spider is building its lists, the process is called Web crawling. In order to build and maintain a useful list of words, a search engine's spiders have to look at a lot of pages.
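A spider's loop of fetching a page, recording its words, and following its links can be sketched as below. This is only an illustration under stated assumptions: the `crawl` function, the `LinkAndTextParser` class, and the in-memory `FAKE_WEB` dictionary standing in for real HTTP requests are hypothetical names invented for the example. Only `html.parser.HTMLParser` and `collections.deque` come from the Python standard library.

```python
from collections import deque
from html.parser import HTMLParser


class LinkAndTextParser(HTMLParser):
    """Collects the links and the words found on one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        # Remember the target of every <a href="..."> link.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Record the visible words of the page in lower case.
        self.words.extend(data.lower().split())


def crawl(fetch, start_url, max_pages=100):
    """Breadth-first crawl; `fetch` returns a page's HTML or None if missing."""
    seen = {start_url}
    queue = deque([start_url])
    word_lists = {}
    while queue and len(word_lists) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # broken link: nothing to record
        parser = LinkAndTextParser()
        parser.feed(html)
        word_lists[url] = parser.words
        for link in parser.links:
            if link not in seen:  # visit each page only once
                seen.add(link)
                queue.append(link)
    return word_lists


# Hypothetical three-page site used instead of the real Web.
FAKE_WEB = {
    "/a": '<h1>Search engines</h1><a href="/b">next</a> <a href="/c">more</a>',
    "/b": '<p>Spiders crawl pages</p><a href="/a">back</a>',
    "/c": '<p>Index of words</p><a href="/missing">broken</a>',
}
crawled = crawl(FAKE_WEB.get, "/a")
print(crawled)
```

A real spider would also need politeness rules, URL normalization, and error handling, which are left out here to keep the sketch short.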
(contd.) Indexing: Allows fast access to data. The contents of each page are analyzed to determine how it should be indexed (for example, words are extracted from the titles, headings, or special fields called meta tags).
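Recording "what words appear where" is commonly done with an inverted index, which maps each word to the documents and positions in which it occurs. The sketch below is one simple way to build such a structure. The function names `tokenize` and `build_inverted_index` and the three sample documents are assumptions for illustration only.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lower-case words, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(documents):
    """Map each word to {doc_id: [positions where the word appears]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in documents.items():
        for position, word in enumerate(tokenize(text)):
            index[word][doc_id].append(position)
    return index


# Hypothetical documents keyed by document id.
docs = {
    1: "Spiders crawl the Web",
    2: "The index records where words appear",
    3: "Spiders build the index",
}
index = build_inverted_index(docs)
print(sorted(index["spiders"]))  # ids of documents containing "spiders"
print(dict(index["index"]))      # positions of "index" per document
```

Looking up a word is then a single dictionary access instead of a scan over every document, which is what makes the index fast.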
Keyword Density Keyword Density: The ratio of the number of occurrences of a word or phrase on a page to the total number of words on the page.
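The ratio defined above can be computed directly. This is a minimal sketch: the function name `keyword_density`, the simple word-splitting rule, and the sample sentence are assumptions made for this example, and real SEO tools may count words differently.

```python
import re


def keyword_density(text, phrase):
    """Occurrences of a word or phrase divided by the total words on the page."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = re.findall(r"[a-z0-9']+", phrase.lower())
    if not words or not target:
        return 0.0
    n = len(target)
    # Count every position where the phrase matches word for word.
    occurrences = sum(
        1 for i in range(len(words) - n + 1) if words[i:i + n] == target
    )
    return occurrences / len(words)


page_text = "Search engines index pages. Search engines rank pages."
print(keyword_density(page_text, "search"))          # 2 of 8 words -> 0.25
print(keyword_density(page_text, "search engines"))  # 2 occurrences / 8 words -> 0.25
```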
Different search engines
Pictorial representation of the working of a search engine (Standard Web Search Engine Architecture): crawl the web; check for duplicates and store the documents (DocIds); create an inverted index; search engine servers receive the user query and look it up in the inverted index; show results to the user.
Gopher, Archie, Veronica Programs with names like "Gopher", "Archie", and "Veronica" kept indexes of files stored on servers connected to the Internet, and dramatically reduced the amount of time required to find programs and documents. In the early 1990s, getting serious value from the Internet meant knowing how to use Gopher, Archie, and the rest.
Building a Search Building a more complex query requires the use of Boolean operators that allow you to refine and extend the terms of the search. The Boolean operators most often seen are: AND - All the terms joined by "AND" must appear in the pages or documents. Some search engines substitute the operator "+" for the word AND. OR - At least one of the terms joined by "OR" must appear in the pages or documents.
(contd.) NOT - The term or terms following "NOT" must not appear in the pages or documents. Some search engines substitute the operator "-" for the word NOT. FOLLOWED BY - One of the terms must be directly followed by the other. NEAR - One of the terms must be within a specified number of words of the other. Quotation Marks - The words between the quotation marks are treated as a phrase, and that phrase must be found within the document or file.
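The operators above can be sketched with set operations over a tiny document collection. This is an illustration only, not how any particular search engine parses queries: the helper names (`docs_with`, `op_and`, `op_or`, `op_not`, `phrase`, `near`) and the sample documents in `DOCS` are hypothetical. FOLLOWED BY is covered by `phrase`, since a two-word phrase requires one term to be directly followed by the other.

```python
import re


def tokenize(text):
    """Split text into lower-case words."""
    return re.findall(r"[a-z0-9]+", text.lower())


# Hypothetical document collection keyed by document id.
DOCS = {
    1: "web search engines index pages",
    2: "spiders crawl the web",
    3: "search the index with a query engine",
}
TOKENS = {doc_id: tokenize(text) for doc_id, text in DOCS.items()}


def docs_with(term):
    """Ids of documents that contain the term."""
    return {doc_id for doc_id, words in TOKENS.items() if term in words}


def op_and(a, b):
    """AND: both terms must appear."""
    return docs_with(a) & docs_with(b)


def op_or(a, b):
    """OR: at least one of the terms must appear."""
    return docs_with(a) | docs_with(b)


def op_not(a, b):
    """a NOT b: the first term must appear, the second must not."""
    return docs_with(a) - docs_with(b)


def phrase(text):
    """Quotation marks / FOLLOWED BY: the words must appear consecutively."""
    target = tokenize(text)
    n = len(target)
    return {
        doc_id
        for doc_id, words in TOKENS.items()
        if any(words[i:i + n] == target for i in range(len(words) - n + 1))
    }


def near(a, b, distance):
    """NEAR: the two terms must be within `distance` words of each other."""
    result = set()
    for doc_id, words in TOKENS.items():
        pos_a = [i for i, w in enumerate(words) if w == a]
        pos_b = [i for i, w in enumerate(words) if w == b]
        if any(abs(i - j) <= distance for i in pos_a for j in pos_b):
            result.add(doc_id)
    return result


print(op_and("search", "index"))    # documents 1 and 3
print(op_or("spiders", "query"))    # documents 2 and 3
print(op_not("web", "spiders"))     # document 1 only
print(phrase("search engines"))     # document 1 only
print(near("search", "engine", 6))  # document 3 only
```

A production engine would run these operations against an inverted index rather than scanning every document, but the set logic is the same.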
Search Engine Optimization (SEO) is the process of improving the volume and quality of traffic to a website from search engines. As a marketing strategy for increasing a site's relevance, SEO considers how search algorithms work and what people search for. Everyone knows they should be doing it.
OPTIMIZATION Search engines often index millions of pages for certain keywords, and your website can be buried deep on page 100 or worse if it is not optimized properly. Building or adjusting a website so that it comes up on the first or second page of search engine results is known as search engine optimization. If your page is not optimized, it will not get a high ranking and you will not get results, no matter how many search engines the site has been submitted to.
(contd.) Effective SEO may require changes to the HTML source code of a site, so SEO tactics may be incorporated into web site development and design. The term "search engine friendly" may be used to describe web site designs, menus, content management systems, and shopping carts that are easy to optimize. Hardly anyone actually does.
High Quality Search The biggest problem facing users of web search engines today is the quality of the results they get back. While the results are often amusing and expand users' horizons, they are often frustrating and consume precious time. To improve the quality of results, Google makes heavy use of hypertextual information consisting of link structure and link (anchor) text.
Percentage of web users who visit the site shown
Limitations Every search engine has limitations as to coverage. Some have compromised search with economics, i.e. becoming little more than advertisers. Search engines are also often victims of spam indexing, which affects what is included and how it is ranked.
Increasing numbers of indexed pages
Facts Most search engines have vanished. Google is a big player. 63% of Internet users use a search engine in a given session. Approximately 94 million adults use the Internet on an average day. This means approximately 59.22 million people use search engines on an average day. Microsoft realized the Internet is here to stay, dominates the browser market, and realizes search is critical.
Conclusion A search engine is designed for getting relevant results. The primary goal is to provide high quality search results over a rapidly growing World Wide Web. For example, Google employs a number of techniques to improve search quality, including PageRank, anchor text, and proximity information. Furthermore, Google is a complete architecture for gathering web pages, indexing them, and performing search queries over them.