Necessity of SEO…
Online advertising drives $6 in offline (in-store) sales for every $1 spent online.
Search marketing has three times greater impact on in-store sales lift than display advertising.
74% of respondents used search engines to find local business information, versus 65% who turned to the print Yellow Pages, 50% who used Internet Yellow Pages, and 44% who used traditional newspapers.
86% of those surveyed said they have used the Internet to find a local business, up from the 70% figure reported the year before.
80% reported researching a product or service online, then making that purchase offline from a local business.
“iProspect and Jupiter” Research…
62% of search engine users click on a search result within the first page of results, and 90% within the first three pages.
41% of search engine users report changing their search term and/or search engine if they do not find what they are looking for on the first page of results; 88% report doing so after three pages.
36% of users agree that “seeing a company listed among the top results on a search engine makes me think that the company is a top one within its field.”
Assume page A has pages P1 … Pn which point to it. The parameter d is a damping factor which can be set between 0 and 1, and C(Pi) is defined as the number of links going out of page Pi. The PageRank of page A is given as follows:
PR(A) = (1 − d) + d (PR(P1)/C(P1) + … + PR(Pn)/C(Pn))
Usually the parameter d is set to 0.85. PageRank, or PR(A), can be calculated using a simple iterative algorithm.
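The iterative algorithm can be sketched as follows. This is a minimal illustration of the formula above (the original, non-normalized form, whose values sum to the number of pages); the three-page graph is hypothetical.

```python
# Minimal sketch of iterative PageRank, using the formula above.
# The tiny three-page web graph is invented purely for illustration.

def pagerank(links, d=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    pr = {p: 1.0 for p in pages}  # start every page at PR = 1
    for _ in range(iterations):
        new_pr = {}
        for a in pages:
            # Sum PR(Pi)/C(Pi) over every page Pi that links to a
            rank_sum = sum(pr[p] / len(links[p])
                           for p in pages if a in links[p])
            new_pr[a] = (1 - d) + d * rank_sum
        pr = new_pr
    return pr

web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(web)
```

After enough iterations the values converge; page C, which is cited by both A and B, ends up with the highest rank.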
Other features: anchor-text processing, location information, and various data structures that make full use of the characteristics of the Web.
Web can be viewed as a huge directed graph G(V, E)
where V is the set of web pages (vertices) and E is the set of hyperlinks (directed edges).
Each page may have a number of outgoing edges (forward links) and a number of incoming links (backlinks).
Each backlink of a page represents a citation to the page.
PageRank is a measure of global web page importance based on the backlinks of web pages.
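The graph view above can be sketched in code: forward links stored as an adjacency list, with backlinks derived by inverting it. The page names are hypothetical.

```python
# Sketch: the Web as a directed graph G(V, E). Forward links are an
# adjacency list; backlinks (citations) are computed by inverting it.

from collections import defaultdict

forward = {                    # E: directed edges (hyperlinks)
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
}

backlinks = defaultdict(list)  # each backlink is a citation to the target
for page, outs in forward.items():
    for target in outs:
        backlinks[target].append(page)
```

Here "about" has two backlinks (from "home" and "blog"), i.e. two citations.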
“Crawlers” or “Spiders” in Web… The link structure of the Web serves to bind together all of the pages that were made public as a result of someone linking to them. Through links, search engines’ automated robots, called crawlers or spiders, can reach the many billions of interconnected documents.
How Search Engine evaluate “trust in a Website”…
Key Factor : Click distance between your website and the most trusted websites.
[Diagram: click distance between “Your website” and the “Most trusted website”]
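One plausible way to compute click distance is a breadth-first search outward from a set of trusted seed pages; the sketch below assumes that model, and the graph and seed set are invented for illustration.

```python
# Hedged sketch of "click distance": BFS from trusted seed pages.
# The link graph and the trusted seed are hypothetical examples.

from collections import deque

def click_distance(links, trusted_seeds):
    """Return the minimum number of clicks from any trusted page."""
    dist = {p: 0 for p in trusted_seeds}
    queue = deque(trusted_seeds)
    while queue:
        page = queue.popleft()
        for nxt in links.get(page, []):
            if nxt not in dist:          # first visit = shortest distance
                dist[nxt] = dist[page] + 1
                queue.append(nxt)
    return dist

links = {"dmoz": ["hub"], "hub": ["yoursite"], "yoursite": []}
d = click_distance(links, ["dmoz"])
```

In this toy graph, "yoursite" is two clicks from the trusted seed; the fewer clicks, the more trust flows to it under this model.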
Search Engine “Retrieval and Ranking” Aspects…
Relevance : Degree to which the content of the documents returned in a search matches the user’s query intention and terms.
Importance or popularity : Relative importance, measured via citation (the act of one work referencing another, as often occurs in academic and business documents) of a given document that matches the user’s query.
Trust : Relative authority of the site, and the trust the search engine has in it.
Discovers the semantic connectivity between two words.
e.g., both oranges and bananas are fruits, but they are not both round.
A machine knows an orange is round and a banana is not by scanning thousands of occurrences of the words banana and orange in its index and noting that round and banana do not have great co-occurrence, while orange and round do.
LSI (Latent Semantic Indexing) uses semantic analysis to identify related web pages.
e.g., the search engine may notice one page that talks about doctors and another that talks about physicians, and determine that the pages are related based on the other words they have in common.
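The co-occurrence idea described above can be sketched very simply: count how often two words appear in the same document across an index. The mini "index" of documents below is invented for illustration.

```python
# Toy co-occurrence counter: how often do two words share a document?
# The four-document "index" is a made-up example.

from itertools import combinations
from collections import Counter

docs = [
    "an orange is round and sweet",
    "the round orange rolled away",
    "a banana is long and yellow",
    "peel the banana before eating",
]

pair_counts = Counter()
for doc in docs:
    words = set(doc.split())
    for a, b in combinations(sorted(words), 2):
        pair_counts[(a, b)] += 1
```

In this tiny index, (orange, round) co-occur twice while (banana, round) never do, which is exactly the signal described above.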
Links act as navigational elements for the search engines during the crawl and enable a detailed analysis of each web page.
The search engine performs a detailed analysis of all the words and phrases that appear on a web page, then builds a map of that data to consult when deciding whether to show your page in the results for a related search query. This map is referred to as a semantic map.
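A toy version of such a map could record the frequency of every word and two-word phrase on a page; this sketch assumes that simplistic model (real engines analyze far richer signals), and the page text is hypothetical.

```python
# Toy "semantic map": frequencies of words and two-word phrases on a page.
# The page text is invented; real semantic analysis is far more elaborate.

from collections import Counter

page_text = "fresh orange juice and orange marmalade made from fresh oranges"
words = page_text.split()

term_map = Counter(words)                                    # single words
term_map.update(" ".join(p) for p in zip(words, words[1:]))  # 2-word phrases
```

The map records, for example, that "orange" appears twice and the phrase "fresh orange" appears once.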
AJAX is an example of a technology that can present significant human-readable content that the search engines cannot see.
Search engines want their users to have good experiences. If your site is subject to frequent outages, by definition it is not providing a good user experience. So, if the search engine crawler is frequently unable to access your web pages, the search engine will assume it is dealing with a low-quality site.
Content very similar to or duplicate of other web pages
External links to low-quality/spam sites
Participation in link schemes or actively selling links
Changes in how Google stores the massive amount of data gathered by its robots.
This is a direct response to the rise of new digital media such as streaming videos, blog posts, and social media content (Twitter, Facebook). The old Google infrastructure was built to handle data by way of Collection > Quality Ranking > Sandbox > Indexing. However, with the explosion of real-time content, search engines face the daunting task of filtering all this content to provide real-time search.
Google uses robots that crawl the Web for data (Googlebot); traditionally this is data that may not change or update in real time. The Caffeine update must include changes to the robots to cater for real-time content. The current theory is that Google has developed several types of robots that differ in their indexing and crawl rates to cater for different media content.
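If crawlers do differ in revisit rate, the scheduling could be sketched as a priority queue keyed by next-visit time, with shorter intervals for real-time content. This is speculative; the URLs, content types, and intervals below are invented for illustration.

```python
# Speculative sketch: a recrawl scheduler where different content types
# are revisited at different rates. All URLs and intervals are made up.

import heapq

intervals = {"news": 5, "blog": 60, "static": 1440}  # revisit, in minutes

queue = []  # entries are (next_visit_time, url, content_type)
for url, ctype in [("cnn.com", "news"), ("myblog.com", "blog"),
                   ("example.com/about", "static")]:
    heapq.heappush(queue, (0, url, ctype))

now = 0
order = []
for _ in range(5):                       # simulate five crawl steps
    t, url, ctype = heapq.heappop(queue)
    now = max(now, t)
    order.append(url)                    # crawl the page, then reschedule it
    heapq.heappush(queue, (now + intervals[ctype], url, ctype))
```

In this simulation the fast-changing news site is crawled three times in five steps, while the static page is visited only once.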
An increased weighting on domain authority, with some authoritative tag-type pages ranking (like Technorati tag pages and Facebook tag pages), as well as pages on sites like Scribd ranking for some long-tail queries based mostly on domain authority and somewhat spammy on-page text
Perhaps slightly more weight on exact-match domain names
Perhaps a bit better understanding of related words / synonyms
Tuning down some of the exposure for video and some universal search results
The new search infrastructure improves the index size and the speed of queries and, most importantly, changes the value of search engine rankings.
A search on the new infrastructure, for instance, returns video and news results midway down the page.
A search on the existing infrastructure, however, returns news at the top, video in the middle, and images at the bottom of the page.