Google is a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext.
Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems.
Google is a large-scale search engine which addresses many of the problems of existing systems. It makes especially heavy use of the additional structure present in hypertext to provide much higher quality search results.
How Google Works
Google consists of three distinct parts, each of which runs on a distributed network of thousands of low-cost computers and can therefore carry out fast parallel processing. Parallel processing is a method of computation in which many calculations are performed simultaneously, significantly speeding up data processing.
1. Googlebot, a web crawler that finds and fetches web pages.
2. The indexer, which sorts every word on every page and stores the resulting index of words in a huge database.
3. The query processor, which compares your search query to the index and recommends the documents that it considers most relevant.
When we enter a query:
1. The web server sends the query to the index servers. The content inside the index servers is similar to the index in the back of a book: it tells which pages contain the words that match any particular query term.
2. The query travels to the doc servers, which actually retrieve the stored documents. Snippets are generated to describe each search result.
3. The search results are returned to the user in a fraction of a second.
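The "index in the back of a book" idea and the snippet step can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not Google's implementation: the function names build_index, lookup and snippet, the whitespace tokenizer, and the snippet width are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from typing import Dict, Set


def build_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map every word to the set of document IDs that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def lookup(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """Return the IDs of documents that contain every query word."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


def snippet(text: str, term: str, width: int = 30) -> str:
    """Cut a short excerpt around the first occurrence of the term."""
    pos = text.lower().find(term.lower())
    if pos == -1:
        return text[:width]
    start = max(0, pos - width)
    end = min(len(text), pos + len(term) + width)
    return text[start:end]
```

In this sketch the index plays the role of the index servers (word to documents), while the raw document text stands in for what the doc servers retrieve to build snippets.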
Crawling technology is needed to gather the web documents and keep them up to date. Storage space must be used efficiently to store indices and, optionally, the documents themselves. The indexing system must process hundreds of gigabytes of data efficiently. Queries must be handled quickly, at a rate of hundreds to thousands per second.
Google is designed to scale well to extremely large data sets. It makes efficient use of storage space to store the index. Its data structures are optimized for fast and efficient access. Further, we expect that the cost to index and store text or HTML will eventually decline relative to the amount that will be available. This will result in favourable scaling properties for centralized systems like Google.
The Google search engine has two important features that help it produce high precision results. First, it makes use of the link structure of the Web to calculate a quality ranking for each web page. This ranking is called PageRank. Second, Google utilizes the text of links (anchor text) to improve search results.
PageRank: Bringing Order to the Web
The citation (link) graph of the web is an important resource that has largely gone unused in existing web search engines. Google's developers have created maps containing as many as 518 million of these hyperlinks. These maps allow rapid calculation of a web page's "PageRank", an objective measure of its citation importance that corresponds well with people's subjective idea of importance. Because of this correspondence, PageRank is an excellent way to prioritize the results of web keyword searches. For most popular subjects, a simple text matching search that is restricted to web page titles performs admirably when PageRank prioritizes the results. For the type of full text searches in the main Google system, PageRank also helps a great deal.
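To make the idea of computing importance from a link map concrete, here is a minimal Python sketch of an iterative PageRank calculation. The text above does not give the formula, so this uses the commonly described normalized variant; the damping factor of 0.85, the fixed iteration count, and the handling of pages without outgoing links are assumptions of this sketch rather than details taken from this document.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]],
             damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Iteratively estimate PageRank for a small link graph.

    links maps each page to the pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

For example, with links = {"a": ["c"], "b": ["c"], "c": ["a"]}, page "c" ends up ranked highest because two pages cite it, and "b" lowest because nothing links to it.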
Anchor Text
The text of links is treated in a special way in our search engine. Most search engines associate the text of a link with the page that the link is on. In addition, we associate it with the page the link points to. This has several advantages. First, anchors often provide more accurate descriptions of web pages than the pages themselves. Second, anchors may exist for documents which cannot be indexed by a text-based search engine, such as images, programs, and databases. This makes it possible to return web pages which have not actually been crawled. Note that pages that have not been crawled can cause problems, since they are never checked for validity before being returned to the user. In this case, the search engine can even return a page that never actually existed, but had hyperlinks pointing to it. However, it is possible to sort the results, so that this particular problem rarely happens.
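The anchor text idea can be sketched in a few lines of Python. This is an illustration only: the (source, target, anchor text) tuple layout and the function name index_anchor_text are hypothetical, and the example URLs are placeholders.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical record layout: (source URL, target URL, anchor text)
Link = Tuple[str, str, str]


def index_anchor_text(links: Iterable[Link]) -> Dict[str, Set[str]]:
    """Associate each anchor word with the linking page AND the linked page."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for source, target, anchor in links:
        for word in anchor.lower().split():
            index[word].add(source)  # what most search engines do
            index[word].add(target)  # the additional association described above
    return dict(index)
```

With a link such as ("http://a.example/", "http://b.example/logo.png", "Company logo"), a search for "logo" can return the image URL even though the image itself was never crawled or parsed as text.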
Aside from PageRank and the use of anchor text, Google has several other features. First, it has location information for all hits, so it makes extensive use of proximity in search. Second, Google keeps track of some visual presentation details such as the font size of words. Words in a larger or bolder font are weighted higher than other words. Third, the full raw HTML of pages is available in a repository.
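A small Python sketch shows how location and presentation details could feed into ranking. Everything numeric here is an assumption made for illustration: the Hit record, the base font size, and the weight increments are hypothetical and are not taken from Google.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    """One occurrence of a word in a document (hypothetical record)."""
    word: str
    position: int      # word offset within the document
    font_size: int     # relative font size
    bold: bool = False


def hit_weight(hit: Hit, base_font: int = 3) -> float:
    """Weight a hit higher when it is in a larger or bolder font."""
    weight = 1.0 + 0.25 * max(0, hit.font_size - base_font)
    if hit.bold:
        weight += 0.5
    return weight


def proximity(hits_a: List[Hit], hits_b: List[Hit]) -> int:
    """Smallest distance, in words, between any hit of one term and any hit of another."""
    return min(abs(a.position - b.position) for a in hits_a for b in hits_b)
```

A ranking function could then prefer documents where the query terms have a small proximity value and high hit weights.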
Google Advanced Search can be applied to texts, terms, files and so on. It makes it possible to run an advanced search restricted by the following criteria:
• Language
• File format
• Domain
• Books
• Code
Figure 1: Google Advanced Image Search
1. Parse the query.
2. Convert words into wordIDs.
3. Seek to the start of the doclist in the short barrel for every word.
4. Scan through the doclists until there is a document that matches all the search terms.
5. Compute the rank of that document for the query.
6. If we are in the short barrels and at the end of any doclist, seek to the start of the doclist in the full barrel for every word and go to step 4.
7. If we are not at the end of any doclist go to step 4.
8. Sort the documents that have matched by rank and return the top k.
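The evaluation steps above can be sketched in Python. This is a simplified illustration under stated assumptions: barrels are modelled as dictionaries from wordID to a sorted list of docIDs, the lexicon is a plain word-to-wordID dictionary, and the rank callback is a hypothetical stand-in for the real ranking function. The explicit "go to step 4" jumps are replaced by loops.

```python
from typing import Callable, Dict, List

Barrel = Dict[int, List[int]]  # wordID -> sorted list of docIDs (assumed layout)


def matching_docs(barrel: Barrel, word_ids: List[int]) -> List[int]:
    """Scan all doclists in parallel and collect docIDs present in every list."""
    doclists = [barrel.get(word_id, []) for word_id in word_ids]
    if not doclists:
        return []
    cursors = [0] * len(doclists)
    matches = []
    # Stop as soon as the end of any doclist is reached.
    while all(c < len(d) for c, d in zip(cursors, doclists)):
        current = [d[c] for c, d in zip(cursors, doclists)]
        target = max(current)
        if all(doc == target for doc in current):
            matches.append(target)
            cursors = [c + 1 for c in cursors]
        else:
            # Advance every cursor that is behind the largest docID.
            cursors = [c + 1 if d[c] < target else c
                       for c, d in zip(cursors, doclists)]
    return matches


def search(query: str, lexicon: Dict[str, int],
           short_barrels: Barrel, full_barrels: Barrel,
           rank: Callable[[int, List[int]], float], k: int = 10) -> List[int]:
    words = query.lower().split()                       # step 1: parse the query
    if not words or any(w not in lexicon for w in words):
        return []
    word_ids = [lexicon[w] for w in words]              # step 2: words to wordIDs
    scored: Dict[int, float] = {}
    for barrel in (short_barrels, full_barrels):        # short barrels first, then full
        for doc_id in matching_docs(barrel, word_ids):  # steps 3-4: seek and scan
            if doc_id not in scored:
                scored[doc_id] = rank(doc_id, word_ids)  # step 5: rank the match
    # Final step: sort the matches by rank and return the top k.
    return sorted(scored, key=lambda d: scored[d], reverse=True)[:k]
```

The sketch always continues into the full barrels once the short barrels are exhausted, which is what the steps describe when the end of a short doclist is reached.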
What is a query?
It is a request for information from a search engine. A query consists of one or more words, numbers, or phrases that you hope to find in the search results listings. To enter a query, type descriptive words into Google's search box. You can use either the search box on Google's home page (shown above) or the search box that always appears at the top of a Google results page. Now press the ENTER key or click on the "Google Search" button to view your search results, which include links to pages that match your query along with relevant snippets (excerpts) with your search terms in boldface.
Search within results
You can get the same results in one step fewer by simply adding terms to your previous query. In Internet Explorer and some other browsers, you can double-click on a term to highlight it. Then type a new term, or hit the DELETE key to remove the term. Triple-click in the search box to highlight your entire query. Then enter a new query, or hit the DELETE key to remove the old query.
Instead of searching for related topics with a single query, divide the query into several parts. Looking for a job? By searching for tips on each aspect, you'll find more sites than by searching for sites that describe all the aspects of a job search.
Google Earth is a very well-known interactive mapping application powered by satellite and aerial imagery that covers the vast majority of the planet. Google Earth is generally considered to be remarkably accurate and extremely detailed. Many major cities have such detailed images that one can zoom in close enough to see vehicles and pedestrians clearly. Consequently, there have been some concerns about national security implications, even though the images are not updated constantly.
Google has many other products in Google Labs that have not been released yet because they are still being tested for use by the general public.
One distinguishing feature of Google Search is its logic engine, which is based on the Boolean logic created by the British mathematician George Boole. The Google engine therefore allows finding words, texts and so on using logical values subject to these conditions:
• A value must be either true or false
• A value cannot be true and false at the same time
• True is represented as 1 and false as 0 (zero)
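As a rough illustration of Boolean logic applied to search, the Python sketch below evaluates each document against three kinds of conditions, each of which is either true or false. The function name boolean_search and its parameters all_of, any_of and none_of are hypothetical; this does not reproduce Google's actual query operators.

```python
from typing import Dict, Iterable, List


def boolean_search(docs: Dict[int, str],
                   all_of: Iterable[str] = (),
                   any_of: Iterable[str] = (),
                   none_of: Iterable[str] = ()) -> List[int]:
    """Return IDs of documents that satisfy an AND / OR / NOT combination of words."""
    all_of, any_of, none_of = list(all_of), list(any_of), list(none_of)
    results = []
    for doc_id, text in docs.items():
        words = set(text.lower().split())
        has_all = all(w in words for w in all_of)                   # AND
        has_any = any(w in words for w in any_of) if any_of else True  # OR
        has_none = not any(w in words for w in none_of)             # NOT
        # Each condition is either True (1) or False (0), never both.
        if has_all and has_any and has_none:
            results.append(doc_id)
    return sorted(results)
```

For instance, requiring "google" while excluding "earth" returns only the documents for which the first condition is true and the second is false.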
Now we come to Google Desktop (2), which is desktop search software made by Google for Mac OS X, Linux, and Microsoft Windows. The program allows text searches of a user's e-mails, computer files, music, photos, chats, Web pages viewed, and other "Google Gadgets." Google Desktop has the following features:
• File indexing: After Google Desktop is installed, the software indexes all the files on the computer. Once the initial indexing is completed, the software continues to index files as needed. Users can start searching for files immediately after installing the program. After a search is performed, results can also be returned in an Internet browser on the Google Desktop Home Page, much like the results for Google Web searches.
• Sidebar: A prominent feature of Google Desktop is the Sidebar, which holds several common Gadgets and resides off to one side of the desktop. The Sidebar is available with the Microsoft Windows version of Google Desktop only. The Sidebar comes pre-installed with the following gadgets:
Email - a panel which lets one view one's Gmail messages.
Scratch Pad - here one can store random notes; they are saved automatically.
Photos - displays a slideshow of photos from the "My Pictures" folder.
News - shows the latest headlines from Google News, and how long ago they were written. The News panel is personalized depending on the type of news you read.
Weather - shows the current weather for a location specified by the user.
Web Clips - shows recent posts from RSS news feeds.
Google Talk - if Google Talk is installed, double-clicking the window title will dock it to one's sidebar.
• Quick Find: When searching in the sidebar, deskbar or floating deskbar, Google Desktop displays a "Quick Find" window. This window is filled with six (by default) of the most relevant results from one's computer. These results update as one types, so that one can get to what one wants on one's computer without having to open another browser window.
• Deskbars: Deskbars are boxes which enable one to type in a search query directly from one's desktop. Web results will open in a browser window and selected computer results will be displayed in the "Quick Find" box (see above). A Deskbar can either be a fixed deskbar, which sits in one's Windows Taskbar, or a Floating Deskbar, which one may position anywhere one wants on one's desktop.