Punjab College of Technical Education

SEARCH ENGINE AND META SEARCH ENGINE

SUBMITTED TO: MISS. SHRUTI JAIN
SUBMITTED BY: KIRANDEEP KAUR, NANCY JAIN, SHEENA
Search engine

A web search engine is designed to search for information on the World Wide Web and FTP servers. The search results are generally presented in a list and are often called hits. The information may consist of web pages, images and other types of files. Some search engines also mine data available in databases or open directories. Unlike web directories, which are maintained by human editors, search engines operate algorithmically or use a mixture of algorithmic and human input.

SCOPE

Search Engine Optimization (SEO) has acquired an important position today. To get higher visibility for a website on Google, DMOZ, Yahoo, AltaVista, Dogpile and other search engines, it is necessary to implement better SEO techniques. Google is currently regarded as the leading search engine; hence the most effective SEO efforts and techniques should be aimed at achieving a high PageRank on google.com.

In today's online world, the role of SEO is becoming increasingly noteworthy across the world, especially in the USA, the UK, Europe, Australia and France. Search engines are rapidly becoming the basic way to get accurate results for searches on the Internet. Research has shown that about 80% of internet traffic is generated through search engines. Approximately 75% of users stay only on the first page of the search results, and only about 20% go on to the second page.

After achieving great returns in the offshore outsourcing web development business, India is seeing the scope of its Internet marketing business grow very well. At the start of the 2000s, the scope and future of SEO and SEM did not look bright: very few jobs were being created and very few people in India knew about Search Engine Marketing (SEM).
Nowadays there is great scope and a strong future for SEO and SEM in India, and many SEO expert jobs are being created there. Search Engine Marketing has become a major department in every sector of business, because everyone wants to promote their own business on all major search engines and across the Internet. Many SEO, SEM and Internet marketing job opportunities are being created in India and around the globe.

SEO companies in India typically have dedicated teams of experienced, professional search engine optimization experts, link building experts, website promotion specialists and search-engine-friendly web developers who help clients get the most from their interactive Internet marketing campaigns. These experts have extensive experience in the SEO industry and are well equipped to handle any kind of interactive marketing project.

As search engine optimization is a strategy for improving a company's revenues, some companies outsource these operations. There are quite a few professional SEO outfits that are savvy to the continually changing trends. They also know the golden rule of Google: what has worked in the past will not necessarily deliver as well in the future. Finally, they can devote their entire time to enhancing your SEO initiatives.
Types of search

Although the text box and search button are fairly commonplace, the type of search, often described in terms of the scope of content the search engine has indexed, is not always evident.

Internal search

An internal search can only be used to find content on a single website (or intranet or extranet). For example, the Motive search, at the top right of each page, can only be used to find pages on the Motive website.

External or public search

A public search can be used to find content on any website, anywhere on the web. For example, Google (also see the details below on search engine registration).

Meta search engine

A meta search engine uses the indexes of other search engines to find content anywhere on the web. For example, Dogpile.

Search engine registration

To add a website to its search index, a search engine must first be told where to find it. Notifying a search engine of a new website is referred to as search engine registration. The registration process involves submitting an entry-level webpage address (URL) to a search engine. This entry-level page is typically the address of the homepage or sitemap.

ADD URL PAGES

Quick links to the website registration pages for the top search engines and directories:
- Google
- Yahoo! (requires a Yahoo! account/registration)
- Bing (formerly MSN Live)
- Open Directory Project

In addition to a webpage address, a search engine may also require basic information about your site, such as a short description of the website, topics covered, and owner. Most public search engines have an 'Add URL', 'Submit URL' or 'Suggest a site' link that leads to information on how to register a website. This link is typically found in the list of links at the bottom of the search engine homepage.

Once the website has been registered, the search engine will access the website using an indexing program (spider). The indexing program follows all the links on the submitted webpage to other webpages under the same domain. It then follows the links it finds on those webpages, 'crawling' the entire website, to build an index of all the website content.

Search engine results

A search engine results page (SERP) lists webpages in order of their relevance to the query entered. The webpage listed at the top of the results page has been selected by the search engine as the most likely to provide the content the user is seeking.

Each search result listing usually features the destination webpage meta title (as the link text), followed by a description and/or an excerpt showing the query highlighted in the context of the webpage content (concordance).
Search engine ranking algorithms

Each search engine has its own method for calculating relevance, usually based on an analysis of the content of the destination webpage, including:
- Meta title (visible at the top of the web browser window);
- Metadata;
- Number of incoming links (commonly referred to as the page's 'popularity'). Popularity-based ranking assumes that the more incoming links a webpage has, the more likely it is to be a subject 'authority';
- Incoming link text: a search engine may make assumptions about the content of a website based on how other people have described it through the text they have used to link to the site;
- Use of appropriate semantic markup, for example, use of heading elements; and
- Page text.

Each of these aspects of the webpage is scored and then weighted. For example, a search engine may assign a greater weighting to meta title text than to other aspects of the webpage. In this case, a webpage that includes the query in its meta title text may be ranked higher than a webpage where the meta title does not include the query.
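As a rough illustration of scoring and weighting, the sketch below scores a page on three aspects (meta title, headings, page text) and combines them with weights. The weights, the field names and the functions `aspect_score` and `score_page` are assumptions made for this example only; real search engines do not publish their calculations.

```python
# Hypothetical weights: the meta title counts more than headings or body text.
WEIGHTS = {"title": 3.0, "headings": 2.0, "text": 1.0}


def aspect_score(query, field_text):
    """Return the fraction of query words found in one aspect of the page."""
    query_words = query.lower().split()
    field_words = set(field_text.lower().split())
    if not query_words:
        return 0.0
    hits = sum(1 for word in query_words if word in field_words)
    return hits / len(query_words)


def score_page(query, page):
    """Combine the weighted aspect scores into a single relevance value."""
    return sum(
        weight * aspect_score(query, page.get(aspect, ""))
        for aspect, weight in WEIGHTS.items()
    )


# Two made-up pages: one has the query in its title, the other only in its text.
page_a = {"title": "Meta search engines", "headings": "Overview", "text": "How they work"}
page_b = {"title": "Home", "headings": "Overview", "text": "Notes on meta search engines"}

ranked = sorted(
    [("a", page_a), ("b", page_b)],
    key=lambda item: score_page("meta search", item[1]),
    reverse=True,
)
print([name for name, _ in ranked])
```

With these assumed weights, the page with the query in its meta title is listed first, which mirrors the example given in the text.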
The scores for each aspect of the webpage are combined to determine the overall relevance of the webpage.

The calculation (algorithm) each search engine uses to rank webpage relevance is often a closely guarded (and patented) secret. This is both to prevent websites from artificially inflating their rankings and because the quality of the search results translates directly into user loyalty, traffic and revenue-generating opportunities.

How web search engines work

[Figure: High-level architecture of a standard Web crawler]

A search engine operates in the following order:
1. Web crawling
2. Indexing
3. Searching

Web search engines work by storing information about many web pages, which they retrieve from the HTML itself. These pages are retrieved by a Web crawler (sometimes also known as a spider), an automated Web browser which follows every link on the site. Exclusions can be made by the use of robots.txt. The contents of each page are then analyzed to determine how it should be indexed (for example, words are extracted from the titles, headings, or special fields called meta tags). Data about web pages are stored in an index database for use in later queries. A query can be a single word. The purpose of an index is to allow information to be found as quickly as possible. Some search engines, such as Google, store all or part of the source page (referred to as a cache) as well as information about the web pages, whereas others, such as AltaVista, store every word of every page they find. This cached page always holds the actual search text since it is the one that was actually indexed, so it can be very useful when the content of the current page has been updated and the search terms are no longer in it. This problem might be considered a mild form of linkrot, and Google's handling of it increases usability by satisfying user expectations that the search terms will be on the returned webpage.
This satisfies the principle of least astonishment, since the user normally expects the search terms to be on the returned pages. Increased search relevance makes these cached pages very useful, even beyond the fact that they may contain data that may no longer be available elsewhere.

When a user enters a query into a search engine (typically by using keywords), the engine examines its index and provides a listing of best-matching web pages according to its criteria, usually with a short summary containing the document's title and sometimes parts of the text. The index is built from the information stored with the data and the method by which the information is indexed. Unfortunately, at the time of writing there were no known public search engines that allowed documents to be searched by date. Most search engines support the use of the Boolean operators AND, OR and NOT to further specify the search query. Boolean operators are for literal searches that allow the user to refine and extend the terms of the search. The engine looks for the words or phrases exactly as entered. Some search engines provide an advanced feature called proximity search, which allows users to define the distance between keywords. There is also concept-based searching, where the search involves using statistical analysis on pages containing the words or phrases you search for. As well, natural language queries allow the user to type a question in the same form one would ask it of a human. Ask.com is a site of this kind.

The usefulness of a search engine depends on the relevance of the result set it gives back. While there may be millions of web pages that include a particular word or phrase, some pages may be more relevant, popular, or authoritative than others. Most search engines employ methods to rank the results to provide the "best" results first. How a search engine decides which pages are the best matches, and what order the results should be shown in, varies widely from one engine to another. The methods also change over time as Internet usage changes and new techniques evolve. Two main types of search engine have evolved: one is a system of predefined and hierarchically ordered keywords that humans have programmed extensively. The other is a system that generates an "inverted index" by analyzing the texts it locates. This second form relies much more heavily on the computer itself to do the bulk of the work.

Different techniques of Searching on Google

Google is one of the top search engines, and if you use Google only to search for words and phrases, you're doing it wrong. There are many things you can do with Google Search. The service is loaded with advanced tricks that you can use from that unassuming search box.

Find the current time elsewhere: Don't bother trying to convert the time from your local setting to a distant city. Just type time followed by the city, as in time Delhi, to see the current time in that location.
Search for a file type: You can look up results that match a specific file type. This trick is great for special searches, such as tracking down a product manual or video file.
Try the pattern search term filetype:xxx, where xxx is the three-letter file type. For example, entering Zoom H2 manual filetype:pdf finds the manual for that Zoom recording device.
Get the weather: To see the weather for many U.S. and worldwide cities, type "weather" followed by the city and state, U.S. zip code, or city and country. For example: weather Delhi
Calculate and convert: To use Google's built-in calculator function, simply enter the calculation you'd like done into the search box. Try typing math problems, such as 89*22/(16), or conversions, like 100 yards = ? meters. Google will do the rest.
Track stocks: To see current market data for a given company or fund, type the ticker symbol into the search box. On the results page, you can click the link to see more data from Google Finance. You can enter a stock's trading abbreviation, such as GOOG, and the first result will show the stock's latest price, a graph of the day, and other financial details.
Get movie times: On the Web you have a myriad of choices to look up show times, but Google's simplicity is tough to beat. To find reviews and show times for the movies playing near you, type "movies" or the name of a current film into the Google search box. If you've already saved your location on a previous search, the top search result will display show times at nearby theaters for the movie you've chosen. Click the More movies link to get more specific listings.
Track packages: Have a FedEx, UPS, or USPS tracking number? Just enter it in the Google search box for the latest package status.
Sports Scores: To see scores and schedules for sports teams, type the team name or league name into the search box. This is enabled for many leagues, including the National Basketball Association, National Football League, National Hockey League, and Major League Baseball.
Music: Want the details of a song? Use music: followed by the song name for a music-specific search on Google.
Area Code Lookup: Type a US area code into Google to find out where the area code is.
Format specific search: Sometimes finding what you want in Google can be difficult, but Google offers a range of format-specific search sites. Google News, Blog Search, and even Video are a few Google sites you can use to find what you're looking for.
Phrase Search: I use this trick regularly. If you're looking for the exact phrase, not just the words entered, put the search in quotation marks, like this: "I did but see her passing by"
Wildcard: Old DOS users will remember doing directory searches using an asterisk (*) as a wildcard, and Google supports wildcard entries as well. Example: blogging *.com.au
Not: Adding a minus (-) allows you to narrow your search. For example, if you wanted to search for New York but not City, you'd enter New York -City
Either/or: Google looks for the combination of terms you type in, but you can tell it to look for either of several words, for example Olympic OR Gold. The shortcut is |, so Olympic | Gold works as well.
Book Search: If you're looking for results from Google Book Search, you can enter the name of the author or book title into the search box, and Google will return any book content it has as part of your normal web results. You can click through on the record to view more detailed information about that author or title.
Earthquakes: To see information about recent earthquakes in a specific area, type "earthquake" followed by the city and state or U.S. zip code. For recent earthquake activity around the world, simply type "earthquake" in the search box.
Unit Conversion: You can use Google to convert between many different units of measurement, such as height, weight, and volume, among many others. Just enter your desired conversion into the search box and Google will do the rest.
Synonym Search: If you want to search not only for your search term but also for its synonyms, place the tilde sign (~) immediately in front of your search term.
Dictionary Definitions: To see a definition for a word or phrase, simply type the word "define", then a space, then the word(s) you want defined. To see a list of different definitions from various online sources, you can type "define:" followed by a word or phrase. Note that the results will define the entire phrase.
Spell Checker: Google's spell checking software automatically checks whether your query uses the most common spelling of a given word. If it thinks you're likely to generate better results with an alternative spelling, it will ask "Did you mean: (more common spelling)?". Click the suggested spelling to launch a Google search for that term.
Airline Travel Info: To see flight status for arriving and departing U.S. flights, type the name of the airline and the flight number into the search box. You can also see delays at a specific airport by typing the name of the city or three-letter airport code followed by the word "airport". For example: American Airlines 18, Houston airport
Currency Conversion: To use the built-in currency converter, simply enter the conversion you'd like done into the Google search box and the answer is provided directly on the results page. For example: 150 GBP in USD
Phone Listing: Let's say someone calls you on your mobile number and you don't know who it is. If all you have is a phone number, you can look it up on Google using the phonebook feature. For example: phonebook: 617-555-1212 (note: the provided number does not work; you'll have to use a real number to get any results).
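Several of the tricks above are just text conventions inside the query (quotation marks for a phrase, a minus sign to exclude a word, filetype: to restrict the file type). A minimal sketch of composing such a query string is shown below. The helper names `build_query` and `search_url` are made up for this example, and the use of a `q` parameter on google.com/search is an assumption based on the commonly seen URL form, not something stated in the text above.

```python
from urllib.parse import urlencode


def build_query(terms="", phrase=None, exclude=(), filetype=None):
    """Compose a query string from plain terms and a few search operators."""
    parts = []
    if terms:
        parts.append(terms)
    if phrase:
        # Quotation marks ask for the exact phrase.
        parts.append(f'"{phrase}"')
    # A leading minus excludes a word from the results.
    parts.extend(f"-{word}" for word in exclude)
    if filetype:
        # No space is placed between "filetype:" and the extension.
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


def search_url(query):
    """Build a search URL; the 'q' parameter is an assumption (see lead-in)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


print(build_query("Zoom H2 manual", filetype="pdf"))
print(build_query("New York", exclude=["City"]))
print(search_url(build_query(phrase="I did but see her passing by")))
```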
Working of a Search Engine

Most Web search engines are commercial ventures supported by advertising revenue and, as a result, some employ the practice of allowing advertisers to pay money to have their listings ranked higher in search results. Those search engines which do not accept money for their search engine results make money by running search-related ads alongside the regular search engine results. The search engines make money every time someone clicks on one of these ads.
Meta search engine

Meta search engines are search engines that search other search engines. Confused? To put it simply, a meta search engine submits your query to several other search engines and returns a summary of the results. Therefore, the search results you receive are an aggregate of multiple searches.

While this strategy gives your search a broader scope than searching a single search engine, the results are not always better. This is because the meta search engine must use its own algorithm to choose the best results from multiple search engines. Often, the results returned by a meta search engine are not as relevant as those returned by a standard search engine.

A metasearch engine is a search tool that sends user requests to several other search engines and/or databases and aggregates the results into a single list or displays them according to their source. Metasearch engines enable users to enter search criteria once and access several search engines simultaneously. Metasearch engines operate on the premise that the Web is too large for any one search engine to index it all and that more comprehensive search results can be obtained by combining the results from several search engines. This also may save the user from having to use multiple search engines separately.

The term "metasearch" is frequently used to classify a set of commercial search engines (see the list of search engines), but it is also used to describe the paradigm of searching multiple data sources in real time. The National Information Standards Organization (NISO) uses the terms Federated Search and Metasearch interchangeably to describe this web search paradigm.

Meta engines do not have the budget of the larger engines and are often simply ignored. Another reason they are overlooked is that, in compiling the results of multiple search engines, they can return unrelated results. Meta search engines also reduce the power of the bigger search engines.

Dogpile is perhaps the best-known meta search engine. It compiles results from ten different websites and, by eliminating duplicate data, gives relevant information. Meta search engines save the searcher time by cutting down the number of search operations.

Meta search engines are programs that send requests to multiple search engines, combine the results and show them. Since they have no database of their own, they search the major search engines and show those results.

Working style of meta search engines: Type the word you want to search for into the search box. The meta search engine then forwards the request to many primary search engines. Since each primary engine has its own rules for requesting data, the meta engine translates the request into the form each engine expects before sending it. Meta search engines send their requests simultaneously, so the requests are processed in parallel, which saves time. Depending on its capacity, the meta engine gets one or more search result pages from each primary search engine. A few meta search engines are very helpful for doing an in-depth search.
Once all the results have been received, the next step is to remove duplicate results and display the rest. The results are usually sorted by the rank given by the primary engine that supplied them. Some meta search engines sort results based on user preferences.

Because meta search engines query several search engines, they show results that an individual primary engine may miss. They save time by searching in parallel, eliminate duplicate results, and combine the separate result sets.

They also have limitations. Timeouts can occur while searching different search engines. Most meta search engines retrieve only ten to fifty results per primary engine. Advanced features and techniques are not available in meta engines. Meta search engines may exclude one or more major search engines such as Google, Yahoo and MSN.

Primary search engines generally do not view meta search engines as competition. Before meta search engines can perform their search operations, they need consent from the primary search engines.

Most people use Google, Yahoo and MSN, the big three search engines, for their searches. But not all search engines are equal: depending on the type of information the user is looking for, different search engines are appropriate. The main purpose of meta search engines is to reduce the time a search takes and to make searching more efficient. Meta search engines are capable of locating results that you might miss in primary engines. However, merging results from different search engines into a single list sometimes raises security issues. Properly understanding these trade-offs makes it easier to choose the best meta search engine.

Operations

Metasearch engines create what is known as a virtual database. They do not compile a physical database or catalogue of the web. Instead, they take a user's request, pass it to several other heterogeneous databases and then compile the results in a homogeneous manner based on a specific algorithm.

No two metasearch engines are alike. Some search only the most popular search engines while others also search lesser-known engines, newsgroups, and other databases. They also differ in how the results are presented and in the number of engines that are used. Some will list results according to search engine or database. Others return results according to relevance, often concealing which search engine returned which results. This benefits the user by eliminating duplicate hits and grouping the most relevant ones at the top of the list.

Search engines frequently have different ways they expect requests to be submitted. For example, some search engines allow the usage of the word "AND" while others require "+" and others require only a space to combine words. The better metasearch engines try to synthesize requests appropriately when submitting them.
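The three ways of combining words mentioned above ("AND", "+" and a plain space) can be sketched as a small translation step that a metasearch engine might run before forwarding a query. The style names and the function `translate_query` are hypothetical labels chosen for this example; they do not describe any particular engine.

```python
def translate_query(words, style):
    """Rewrite a list of required words in the syntax one engine expects."""
    if style == "and_keyword":
        # Engines that accept the word AND between terms.
        return " AND ".join(words)
    if style == "plus_prefix":
        # Engines that require a + in front of each required term.
        return " ".join("+" + word for word in words)
    if style == "space":
        # Engines that treat a plain space as "all of these words".
        return " ".join(words)
    raise ValueError(f"unknown query style: {style}")


words = ["meta", "search", "engine"]
for style in ("and_keyword", "plus_prefix", "space"):
    print(style, "->", translate_query(words, style))
```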
Architecture of a metasearch engine

Architecture diagram….
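The metasearch workflow described in this report (send the query to several engines in parallel, skip engines that fail or time out, remove duplicates, and produce one ranked list) can be sketched roughly as follows. The three engine functions are stubs with made-up URLs, and the merging rule (summing 1/(rank + 1) for each engine that returned a URL) is a simple choice made for this illustration, not a method taken from the source.

```python
from concurrent.futures import ThreadPoolExecutor


def engine_one(query):
    """Stub primary engine: returns made-up URLs in ranked order."""
    return ["http://a.example/", "http://b.example/", "http://c.example/"]


def engine_two(query):
    """Second stub engine with a partly overlapping result list."""
    return ["http://b.example/", "http://d.example/"]


def engine_failing(query):
    """Stub that simulates an engine that does not answer in time."""
    raise TimeoutError("no answer in time")


def metasearch(query, engines, timeout=2.0):
    """Query all engines in parallel and merge results into one list."""
    scores = {}
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = [pool.submit(engine, query) for engine in engines]
        for future in futures:
            try:
                urls = future.result(timeout=timeout)
            except Exception:
                continue  # ignore engines that fail or time out
            for rank, url in enumerate(urls):
                # Using the URL as the key removes duplicates; a URL returned
                # by several engines accumulates a higher score.
                scores[url] = scores.get(url, 0.0) + 1.0 / (rank + 1)
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))


print(metasearch("meta search", [engine_one, engine_two, engine_failing]))
```

In this sketch the URL returned by both working engines rises to the top of the merged list, and the failing engine is simply left out of the result.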
Observations About Search Engine Operation

- The criteria and algorithms used vary from search engine to search engine.
- The criteria and algorithms are complex, but are not published.
- The criteria and algorithms change over time, as often as every 2 weeks.
- There is consolidation among search engines, and new ones are being added continually.
- Search engines reward simple pages with a high concentration of keywords and key phrases.
- Search engines reward repetition of keywords and key phrases, but penalize spamming.
- Search engines penalize old pages.
- 50% of search engines (including AltaVista and Google) will preferentially index pages with many links from the outside.
- Some search engines (DirectHits.com) reward how often a page is selected and how much time is spent with it.
- Websites are only re-indexed infrequently, approximately every 8 weeks to 6 months.
- No search engine indexes more than 16% of the world's 800 million URLs (as of Feb 1999). See chart. As of mid-1999, Excite could only index 50 million URLs and would drop pages at random.

Search engines will only:
- Index pages in the top 2 or 3 directories.
- Index the top few hundred words of each page.
- Index a maximum of 300 to 400 pages of large websites (only 25 for Excite).

Search engines will not:
- Index pages with a ? or & in the URL.
- Index dynamically generated pages.

Search engines are increasingly falling behind actual web growth.

Search Engine Operations

The following are the basic operations of a search engine.

Crawling

Search engines use a set of automated programs, known as bots, agents or spiders, to crawl the contents of web pages and documents by following the hyperlink structure. There are billions of web pages available on the Internet, but not all of them are crawled by the search engines.

Indexed documents

After the crawling operation, the crawled contents have to be kept in a repository. Search engines maintain a huge repository of documents, called the "index", to store the content in an organized way. It needs to be tightly managed so that a user query can be answered by searching across billions of documents.
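The crawling and indexing steps can be sketched with a tiny in-memory stand-in for the web. Everything here is made up for illustration: `FAKE_WEB` replaces real HTTP requests, the URLs are placeholders, and the index is a plain dictionary from word to the set of pages containing it (an inverted index in its simplest form). A real crawler would also honour robots.txt, which this sketch leaves out.

```python
from collections import deque
from urllib.parse import urlparse

# A tiny made-up web: URL -> (page text, outgoing links).
FAKE_WEB = {
    "http://site.example/": (
        "search engine basics",
        ["http://site.example/meta", "http://other.example/"],
    ),
    "http://site.example/meta": ("meta search engine", ["http://site.example/"]),
    "http://other.example/": ("unrelated page", []),
}


def crawl(start_url):
    """Follow links breadth-first, staying on the start page's domain."""
    domain = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue:
        url = queue.popleft()
        text, links = FAKE_WEB[url]
        pages[url] = text
        for link in links:
            same_domain = urlparse(link).netloc == domain
            if link not in seen and same_domain and link in FAKE_WEB:
                seen.add(link)
                queue.append(link)
    return pages


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = {}
    for url, text in pages.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
    return index


index = build_index(crawl("http://site.example/"))
print(sorted(index["engine"]))
```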
Processing Queries

Internet users search for millions of words or phrases each day in search engines. When a user submits a query, the search engine matches it against its indexed documents and returns the relevant search results.

Ranking results

Many documents are sure to match the query, so a decision must be made about the order in which to display the search results. Search engines use complex algorithms, based on hundreds of undisclosed factors, to rank the results and find the most relevant ones for the query.

The main objective of search engines is to provide relevant and better results for users' queries. To do that, search engines have developed a number of complex techniques, in effect a language for speaking to websites, forums and blogs, which can be understood only if those sites adopt SEO techniques.
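The query processing and ranking steps described in this section can be sketched on top of a small index. The documents, the function names and the ranking rule (counting how often the required words occur) are assumptions made for illustration; as noted above, real engines rank on many undisclosed factors. Required words behave like a Boolean AND and excluded words like NOT.

```python
# Made-up documents standing in for crawled pages.
DOCS = {
    "doc1": "meta search engine sends the query to each search engine",
    "doc2": "a spider crawls pages for the search engine index",
    "doc3": "meta tags describe a page",
}


def build_index(docs):
    """Map each word to {document id: number of occurrences}."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            counts = index.setdefault(word, {})
            counts[doc_id] = counts.get(doc_id, 0) + 1
    return index


def search(index, required, excluded=()):
    """Return documents with all required words and no excluded word."""
    if not required:
        return []
    matches = set(index.get(required[0], {}))
    for word in required[1:]:
        matches &= set(index.get(word, {}))  # AND: keep common documents
    for word in excluded:
        matches -= set(index.get(word, {}))  # NOT: drop these documents

    def score(doc_id):
        # Simple ranking: total occurrences of the required words.
        return sum(index[word][doc_id] for word in required)

    return sorted(matches, key=lambda doc_id: (-score(doc_id), doc_id))


index = build_index(DOCS)
print(search(index, ["search", "engine"]))
print(search(index, ["search"], excluded=["spider"]))
```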