Basics

When you sit down at your computer and do a search, you're almost instantly presented with a list of results from all over the web. How does a search engine find web pages matching your query, and determine the order of search results? In the simplest terms, you could think of searching the web as looking in a very large book with an impressive index telling you exactly where everything is located. When you perform a search, the search engine checks its index to determine the most relevant search results to be returned ("served") to you. The key processes in delivering search results to you are:

Crawling
Search engines run automated programs, called "bots" or "spiders", that use the hyperlink structure of the web to "crawl" the pages and documents that make up the World Wide Web. Estimates are that of the approximately 20 billion existing pages, search engines have crawled between 8 and 10 billion.

Indexing
Once a page has been crawled, its contents are "indexed": stored in an enormous database of documents that makes up a search engine's "index". This index needs to be tightly managed so that requests which must search and sort billions of documents can be completed in fractions of a second.

Processing Queries
When a request for information comes into the search engine (hundreds of millions do each day), the engine retrieves from its index all the documents that match the query. A document matches if the terms or phrase are found on the page in the manner specified by the user. For example, a search for car and driver magazine at Google returns ~5.8 million results, but a search for the same phrase in quotes ("car and driver magazine") returns only ~80 thousand results.

Ranking Results
Once the search engine has determined which results are a match for the query, the engine's algorithm (a set of mathematical rules used for sorting) runs calculations on each of the results to determine which is most relevant to the given query.
Search engines sort these results on the results pages in order from most relevant to least so that users can choose which to select.

Serving results
When a user enters a query, Google's machines search the index for matching pages and return the results they believe are the most relevant to the user. Relevancy is determined by over 200 factors, one of which is the PageRank for a given page. PageRank is the measure of the importance of a page based on the incoming links from other pages. In simple terms, each link to a page on your site from another site adds to your site's PageRank. Not all links are equal: Google works hard to improve the user experience by identifying spam links and other practices that negatively impact search results. The best types of links are those that are given based on the quality of your content.

In order for your site to rank well in search results pages, it's important to make sure that search engines can crawl and index your site correctly.
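To make the idea of link-based importance concrete, here is a minimal sketch of the simplified, textbook form of PageRank. It is not Google's actual ranking code. The `damping` value, the iteration count, and the tiny three-page link graph in the usage note are assumptions chosen only for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank.

    links: dict mapping each page to the list of pages it links to.
    damping and iterations are illustrative assumptions.
    """
    # Every page that appears as a source or a target takes part.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Each page starts the round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank
```

For example, with `links = {"a": ["c"], "b": ["c"], "c": ["a"]}`, page "c" receives links from two pages and ends up with the highest score, while page "b", which nobody links to, ends up with the lowest.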
Give visitors the information they're looking for
Provide high-quality content on your pages, especially your homepage. This is the single most important thing to do. If your pages contain useful information, their content will attract many visitors and entice webmasters to link to your site. In creating a helpful, information-rich site, write pages that clearly and accurately describe your topic. Think about the words users would type to find your pages and include those words on your site.

Make sure that other sites link to yours
Links help crawlers find your site and can give your site greater visibility in search results. When returning results for a search, Google uses sophisticated text-matching techniques to display pages that are both important and relevant to each search. Google interprets a link from page A to page B as a vote by page A for page B. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important."

Natural links to your site develop as part of the dynamic nature of the web when other sites find your content valuable and think it would be helpful for their visitors. Unnatural links to your site are placed there specifically to make your site look more popular to search engines. Only natural links are useful for the indexing and ranking of your site.

Make your site easily accessible
Build your site with a logical link structure. Every page should be reachable from at least one static text link.
Things to avoid

Don't fill your page with lists of keywords, attempt to "cloak" pages, or put up "crawler only" pages. If your site contains pages, links, or text that you don't intend visitors to see, crawlers consider those links and pages deceptive and may ignore your site. Cloaking is a technique in which the content presented to the search engine spider is different from that presented to the user's browser.

Don't use images to display important names, content, or links. Crawlers don't recognize text contained in graphics. Use ALT attributes if the main content and keywords on your page can't be formatted in regular HTML.

Don't create multiple copies of a page under different URLs. Many sites offer text-only or printer-friendly versions of pages that contain the same content as the corresponding graphic-rich pages.

Don't feel obligated to purchase a search engine optimization service. Some companies claim to "guarantee" high ranking for your site. While legitimate consulting firms can improve your site's flow and content, others employ deceptive tactics in an attempt to fool search engines. Be careful; if your domain is affiliated with one of these deceptive services, it could be banned from indexes.
Good practices for page title tags

Accurately describe the page's content
Choose a title that effectively communicates the topic of the page's content.
Avoid:
• choosing a title that has no relation to the content on the page
• using default or vague titles like "Untitled" or "New Page 1"

Create unique title tags for each page
Each of your pages should ideally have a unique title tag, which helps Google know how the page is distinct from the others on your site.
Avoid:
• using a single title tag across all of your site's pages or a large group of pages

Use brief, but descriptive titles
Titles can be both short and informative. If the title is too long, Google will show only a portion of it in the search result.
• Less than 64 characters is ideal
• Google allows 66 characters
• Yahoo allows 120 characters
• However, we already use 47 characters with " - UW Stout, Wisconsin's Polytechnic University", so titles may be truncated in search engine results
Avoid:
• using extremely lengthy titles that are unhelpful to users
• stuffing unneeded keywords in your title tags
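The length arithmetic above can be checked with a short sketch. It assumes the suffix quoted in these notes is appended to every page title, and it uses the Google and Yahoo limits as stated here, which may have changed since. The function names are hypothetical.

```python
# Suffix quoted in the notes; assumed to be appended to every page title.
SITE_SUFFIX = " - UW Stout, Wisconsin's Polytechnic University"

# Display limits as stated in the notes (not verified against current engines).
LIMITS = {"Google": 66, "Yahoo": 120}


def remaining_chars(limit):
    """Characters left for the page-specific part of the title."""
    return limit - len(SITE_SUFFIX)


def truncated_in(page_title):
    """Names of the search engines whose limit the full title exceeds."""
    full_length = len(page_title + SITE_SUFFIX)
    return [name for name, limit in LIMITS.items() if full_length > limit]
```

Under these assumptions the suffix is 47 characters long, so `remaining_chars(66)` leaves 19 characters for the page-specific part of a title before Google's stated limit is reached.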
Good practices for description meta tags

Accurately summarize the page's content
Write a description that would both inform and interest users if they saw your description meta tag as a snippet in a search result.
Avoid:
• writing a description meta tag that has no relation to the content on the page
• using generic descriptions like "This is a webpage" or "Page about baseball cards"
• filling the description with only keywords
• copying and pasting the entire content of the document into the description meta tag

Use unique descriptions for each page
Having a different description meta tag for each page helps both users and Google, especially in searches where users may bring up multiple pages on your domain (e.g. searches using the site: operator). If your site has thousands or even millions of pages, hand-crafting description meta tags probably isn't feasible. In this case, you could automatically generate description meta tags based on each page's content.
Avoid:
• using a single description meta tag across all of your site's pages or a large group of pages

Description meta tags are important because search engines might use them as snippets for your pages. Note that we say "might" because they may choose to use a relevant section of your page's visible text if it does a good job of matching up with a user's query.
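One possible way to generate description meta tags automatically is sketched below. It simply takes the opening text of a page and cuts it at a word boundary. The function name and the 155-character default are assumptions for illustration, not a limit stated in these notes.

```python
import html
import re


def make_description(page_text, max_length=155):
    """Build a description meta tag from the start of a page's visible text.

    max_length is an assumed budget, not a documented search engine limit.
    """
    # Collapse runs of whitespace so the snippet reads as one line.
    text = re.sub(r"\s+", " ", page_text).strip()
    if len(text) > max_length:
        # Leave room for "..." and cut at the last full word.
        cut = text[: max_length - 3].rsplit(" ", 1)[0]
        text = cut.rstrip(" ,;:") + "..."
    # Escape quotes and ampersands so the attribute stays valid HTML.
    return '<meta name="description" content="%s">' % html.escape(text, quote=True)
```

A generated description is only a starting point. Pages that matter most still benefit from a hand-written summary.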
Good practices for URL structure

Use words in URLs
URLs with words that are relevant to your site's content and structure are friendlier for visitors navigating your site. Visitors remember them better and might be more willing to link to them.
Avoid:
• using lengthy URLs with unnecessary parameters and session IDs
• choosing generic page names like "page1.html"
• using excessive keywords like "baseball-cards-baseball-cards-baseballcards.htm"

Create a simple directory structure
Use a directory structure that organizes your content well and makes it easy for visitors to know where they are on your site. Try using your directory structure to indicate the type of content found at that URL.
Avoid:
• having deep nesting of subdirectories like ".../dir1/dir2/dir3/dir4/dir5/dir6/page.html"
• using directory names that have no relation to the content in them

Provide one version of a URL to reach a document
To prevent some users from linking to one version of a URL and others linking to a different version (this could split the reputation of that content between the URLs), focus on using and referring to one URL in the structure and internal linking of your pages. If you do find that people are accessing the same content through multiple URLs, setting up a 301 redirect from non-preferred URLs to the dominant URL is a good solution for this.
Avoid:
• having pages from subdomains and the root directory (e.g. "domain.com/page.htm" and "sub.domain.com/page.htm") access the same content
• mixing www. and non-www. versions of URLs in your internal linking structure
• using odd capitalization of URLs (many users expect lower-case URLs and remember them better)
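The "one version of a URL" advice can be sketched as a small decision function: given a requested URL, work out the preferred form and report whether a 301 redirect is needed. The host names are hypothetical placeholders, and lowercasing the path assumes the site serves the same content regardless of capitalization.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical hosts for illustration; substitute the site's real ones.
PREFERRED_HOST = "www.example.com"
ALTERNATE_HOSTS = {"example.com"}


def preferred_url(url):
    """Return the single preferred form of a URL."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host in ALTERNATE_HOSTS:
        host = PREFERRED_HOST
    # Assumption: paths on this site are case-insensitive, so lowercase them.
    path = parts.path.lower() or "/"
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


def redirect_for(url):
    """Return (301, target) for a non-preferred URL, or None if already preferred."""
    target = preferred_url(url)
    return None if target == url else (301, target)
```

In practice the redirect itself is usually configured in the web server or content management system. This sketch only shows the decision about which URL is the preferred one.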
Good practices for content

Write easy-to-read text
Users enjoy content that is well written and easy to follow.
Avoid:
• writing sloppy text with many spelling and grammatical mistakes
• embedding text in images for textual content (users may want to copy and paste the text and search engines can't read it)

Stay organized around the topic
It's always beneficial to organize your content so that visitors have a good sense of where one content topic begins and another ends. Breaking your content up into logical chunks or divisions helps users find the content they want faster.
Avoid:
• dumping large amounts of text on varying topics onto a page without paragraph, subheading, or layout separation

Use relevant language
Think about the words that a user might search for to find a piece of your content. Users who know a lot about the topic might use different keywords in their search queries than someone who is new to the topic. For example, a long-time baseball fan might search for [nlcs], an acronym for the National League Championship Series, while a new fan might use a more general query like [baseball playoffs]. Anticipating these differences in search behavior and accounting for them while writing your content (using a good mix of keyword phrases) could produce positive results.

Create fresh, unique content
New content will not only keep your existing visitor base coming back, but also bring in new visitors.
Avoid:
• rehashing (or even copying) existing content that will bring little extra value to users
• having duplicate or near-duplicate versions of your content across your site

Offer exclusive content or services
Consider creating a new, useful service that no other site offers. You could also write an original piece of research, break an exciting news story, or leverage your unique user base. Other sites may lack the resources or expertise to do these things.

Create content primarily for your users, not search engines
Designing your site around your visitors' needs while making sure your site is easily accessible to search engines usually produces positive results.
Avoid:
• inserting numerous unnecessary keywords that are aimed at search engines but are annoying or nonsensical to users
• having blocks of text like "frequent misspellings used to reach this page" that add little value for users
• deceptively hiding text from users, but displaying it to search engines
Good practices for anchor text

Choose descriptive text
The anchor text you use for a link should provide at least a basic idea of what the page linked to is about.
Avoid:
• writing generic anchor text like "page", "article", or "click here"
• using text that is off-topic or has no relation to the content of the page linked to
• using the page's URL as the anchor text in most cases (although there are certainly legitimate uses of this, such as promoting or referencing a new website's address)

Write concise text
Aim for short but descriptive text, usually a few words or a short phrase.
Avoid:
• writing long anchor text, such as a lengthy sentence or short paragraph of text

Think about anchor text for internal links too
You may usually think about linking in terms of pointing to outside websites, but paying more attention to the anchor text used for internal links can help users and search engines navigate your site better.
Avoid:
• using excessively keyword-filled or lengthy anchor text just for search engines
• creating unnecessary links that don't help with the user's navigation of the site
Good practices for heading tags

Imagine you're writing an outline
Similar to writing an outline for a large paper, put some thought into what the main points and sub-points of the content on the page will be and decide where to use heading tags appropriately.
Avoid:
• placing text in heading tags that wouldn't be helpful in defining the structure of the page
• using heading tags where other tags like <em> and <strong> may be more appropriate
• erratically moving from one heading tag size to another

Use headings sparingly across the page
Use heading tags where it makes sense. Too many heading tags on a page can make it hard for users to scan the content and determine where one topic ends and another begins.
Avoid:
• excessively using heading tags throughout the page
• putting all of the page's text into a heading tag
• using heading tags only for styling text and not presenting structure
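As a rough illustration of the "outline" idea, the sketch below scans a page's HTML and reports places where the heading level jumps by more than one step (for example, from h2 straight to h4). The class and function names are hypothetical. Treating such jumps as a sign of erratic heading use is a simplifying assumption, not a rule from any search engine.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collects the levels (1-6) of heading tags in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # Matches h1 through h6 only; tags such as <hr> or <head> are ignored.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def skipped_levels(html_text):
    """Return (previous_level, level) pairs where a heading level was skipped.

    A previous_level of 0 means the page had no heading before that point.
    """
    parser = HeadingOutline()
    parser.feed(html_text)
    jumps = []
    previous = 0
    for level in parser.levels:
        if level > previous + 1:
            jumps.append((previous, level))
        previous = level
    return jumps
```

Moving back up the outline (for example, from h3 to h2) is not flagged, since that is how a new section normally starts.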
Good practices for images

Use brief, but descriptive filenames and alt text
Like many of the other parts of the page targeted for optimization, filenames and alt text (for ASCII languages) are best when they're short, but descriptive.
Avoid:
• using generic filenames like "image1.jpg", "pic.gif", "1.jpg" when possible (some sites with thousands of images might consider automating the naming of images)
• writing extremely lengthy filenames
• stuffing keywords into alt text or copying and pasting entire sentences

Supply alt text when using images as links
If you do decide to use an image as a link, filling out its alt text helps Google understand more about the page you're linking to. Imagine that you're writing anchor text for a text link.
Avoid:
• writing excessively long alt text that would be considered spammy
• using only image links for your site's navigation

Store images in a directory of their own
Instead of having image files spread out in numerous directories and subdirectories across your domain, consider consolidating your images into a single directory (e.g. brandonsbaseballcards.com/images/). This simplifies the path to your images.

Use commonly supported file types
Most browsers support JPEG, GIF, PNG, and BMP image formats. It's also a good idea to have the extension of your filename match the file type.
Good practices for promoting your website

Blog about new content or services
A blog post on your own site letting your visitor base know that you added something new is a great way to get the word out about new content or services. Other webmasters who follow your site or RSS feed could pick the story up as well.

Don't forget about offline promotion
Putting effort into the offline promotion of your company or site can also be rewarding. For example, if you have a business site, make sure its URL is listed on your business cards, letterhead, posters, etc. You could also send out recurring newsletters to clients through the mail letting them know about new content on the company's website.

Know about social media sites
Sites built around user interaction and sharing have made it easier to match interested groups of people up with relevant content.
Avoid:
• attempting to promote each new, small piece of content you create; go for big, interesting items
• involving your site in schemes where your content is artificially promoted to the top of these services

Reach out to those in your site's related community
Chances are, there are a number of sites that cover topic areas similar to yours. Opening up communication with these sites is usually beneficial. Hot topics in your niche or community could spark additional ideas for content or building a good community resource.
Avoid:
• spamming link requests out to all sites related to your topic area
• purchasing links from another site with the aim of getting PageRank instead of traffic
KeywordMatch
A word that must appear anywhere in the query.
KeywordMatches = "Abraham" and "Lincoln"
If your KeywordMatch is "Abraham Lincoln", the search query must include both "Abraham" and "Lincoln" to trigger this KeywordMatch. To get a KeywordMatch for either "Abraham" or "Lincoln", enter two KeywordMatches: one for "Abraham" and one for "Lincoln".

PhraseMatch
A phrase that appears anywhere in the query. For the phrase to match, all of the words must be present, the order of the words must be the same with no intervening words, and any hyphens in the query must be matched.
PhraseMatch = "Abraham Lincoln", "President Abraham Lincoln", "Abraham Lincoln president", and "young Abraham Lincoln"
These are all PhraseMatches because each contains the words of the phrase "Abraham Lincoln" in the order entered. "Abraham the Tall Lincoln" is not a PhraseMatch because "the Tall" separates the phrase "Abraham Lincoln".

ExactMatch
The phrase must exactly match the query.
ExactMatch = "Abraham Lincoln"
Only "Abraham Lincoln" is an ExactMatch for the query. "President Abraham Lincoln" and "Abraham Lincoln's" are not ExactMatches.
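The three match types described above can be expressed as a short sketch. This is not the GSA's own code. It assumes matching is case-insensitive and that a query is split into words on whitespace, neither of which is stated in these notes. The function names are hypothetical.

```python
def _words(text):
    # Assumption: case-insensitive comparison, words separated by whitespace.
    return text.lower().split()


def keyword_match(keymatch, query):
    """Every word of the KeyMatch appears somewhere in the query."""
    query_words = _words(query)
    return all(word in query_words for word in _words(keymatch))


def phrase_match(keymatch, query):
    """The KeyMatch words appear in the query, in order, with nothing between."""
    phrase, query_words = _words(keymatch), _words(query)
    last_start = len(query_words) - len(phrase)
    return any(
        query_words[i : i + len(phrase)] == phrase for i in range(last_start + 1)
    )


def exact_match(keymatch, query):
    """The query consists of exactly the KeyMatch words and nothing else."""
    return _words(keymatch) == _words(query)
```

With the KeyMatch "Abraham Lincoln", the query "Abraham the Tall Lincoln" satisfies `keyword_match` but not `phrase_match`, and "President Abraham Lincoln" satisfies `phrase_match` but not `exact_match`, which lines up with the examples in the notes.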
• Search engine basics
• Characteristics of SE Friendly Sites
• Page Layout Tips for Improved SEO
• Promoting your site
• Analyzing your site traffic
• Optimize for Stout’s GSA
• Advanced searching techniques
Crawling
• Process by which web pages are discovered and added to an index

Indexing
• Compile a massive index of all the words found on:
  ◦ The web page itself
  ◦ Key content tags (ex. title, alt)

Process Queries & Rank Results
• Search the index for matching pages and return the most relevant results to the user, ordered by relevance
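The indexing and query steps summarized above can be illustrated with a toy inverted index. This is a minimal sketch, not how any real search engine stores its data. The page texts in the usage note are invented, and words are simply split on whitespace.

```python
from collections import defaultdict


def build_index(pages):
    """Map each word to the pages it appears on and its positions there.

    pages: dict of url -> page text.
    """
    index = defaultdict(dict)
    for url, text in pages.items():
        for position, word in enumerate(text.lower().split()):
            index[word].setdefault(url, []).append(position)
    return index


def match_all_terms(index, query):
    """Pages containing every word of the query, in any order."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], {}))
    for word in words[1:]:
        results &= set(index.get(word, {}))
    return results


def match_phrase(index, phrase):
    """Pages containing the words of the phrase next to each other, in order."""
    words = phrase.lower().split()
    hits = set()
    for url in match_all_terms(index, phrase):
        for start in index[words[0]][url]:
            # Word i of the phrase must sit i positions after the first word.
            if all(start + i in index[word][url] for i, word in enumerate(words)):
                hits.add(url)
                break
    return hits
```

If page "a" reads "Car and Driver magazine reviews" and page "b" reads "a magazine for the driver of a car and truck", then `match_all_terms` finds both pages for the query car and driver magazine, while `match_phrase` finds only page "a". This is the same effect as putting a query in quotes: fewer, more specific results.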

• They give visitors the info they are looking for
• They make sure that other sites link to them
• They make their site easily accessible

• Create unique, accurate page titles
• Make use of “description” meta tag
• Improve the structure of your URLs
• Make your site easier to navigate
• Offer quality content and services
• Write better anchor/link text
• Use heading tags appropriately
• Optimize your use of images

• They don’t fill their pages with lists of keywords
• They don’t “cloak” or put up “crawler only” pages
• They don’t use images to display important names, content or links
• They don’t create multiple copies of a page under different URLs
• They don’t necessarily use an SEO service

• Accurately describe the page’s content
• Create unique Title tags for each page
• Use brief, but descriptive titles
• Shows in search results

• Accurately summarize the page’s content
• Use unique descriptions for each page
• “May” show in search results

• Use words in URLs
• Create a simple directory structure
• Provide one version of a URL to reach a document

• Create a naturally flowing hierarchy
• Use mostly text for navigation
• Consider what happens when a user removes part of your URL

• Write easy-to-read text
• Stay organized around the topic
• Use relevant language
• Create fresh, unique content
• Offer exclusive content or services
• Create content primarily for your users, not search engines

• Choose descriptive text
• Write concise text
• Think about anchor text for internal links too

• Imagine you’re writing an outline
• Use headings sparingly across the page

• Use brief, but descriptive filenames and alt text
• Supply alt text when using images as links
• Store images in a directory of their own
• Use commonly supported file types

• Blog about new content/services
• Don’t forget about offline promotion
• Know about social media sites
• Reach out to a related community

• Get insight on how users reach and behave on your site
• Discover the most popular content on your site
• Measure the impact of optimizations you make to your site
• Discover additional keywords that searchers might use to find your site

All CommonSpot site traffic is captured with Google Analytics. Contact the webmaster group to get started: firstname.lastname@example.org

GSA – What is it?
• Google Search Appliance

Keymatches
• Keymatches prioritize and establish a specific result in response to a particular search term.

Synonyms
• Same principle as keymatches, except that synonyms suggest alternate terms, which may bring better results.

Go to the following site to submit GSA requests: http://www3.uwstout.edu/webdev/optimizing_google.cfm

Basic Examples
biking Italy             The words biking and Italy
recycle steel OR iron    Information on recycling steel or recycling iron
"I have a dream"         The exact phrase I have a dream
salsa -dance             The word salsa but NOT the word dance
Louis +I France          Information about Louis the First (I), weeding out other kings of France
castle ~glossary         Glossaries about castles, as well as dictionaries, lists of terms, terminology, etc.
fortune-telling          All forms of the term, whether spelled as a single word, a phrase, or hyphenated
define:imbroglio         Definitions of the word imbroglio from the Web

Calculation Examples
+ - * /                   basic arithmetic     12 + 34 - 56 * 7 / 8
% of                      percentage of        45% of 39
^ or **                   raise to a power     2^5 or 2**5
old units in new units    convert units        300 Euros in USD, 130 lbs in kg, 31 in hex
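For readers who want to check what the calculator examples above should evaluate to, the same arithmetic can be reproduced in Python. This is only a sketch for cross-checking. The variable names are made up, the pound-to-kilogram factor is the standard definition, and the currency example is left out because it depends on a live exchange rate.

```python
# Basic arithmetic: multiplication and division bind tighter than + and -.
arithmetic = 12 + 34 - 56 * 7 / 8      # 46 - 49 = -3.0

# "45% of 39" means 45/100 of 39.
percentage = 45 * 39 / 100             # 17.55

# Google accepts ^ or ** for powers; in Python only ** is exponentiation.
power = 2 ** 5                         # 32

# "31 in hex": the decimal number 31 written in base 16.
in_hex = format(31, "x")               # "1f"

# "130 lbs in kg", using the standard factor of 0.45359237 kg per pound.
LB_TO_KG = 0.45359237
in_kg = round(130 * LB_TO_KG, 2)       # 58.97

print(arithmetic, percentage, power, in_hex, in_kg)
```

Note that in Python the `^` symbol means bitwise XOR, so `2 ^ 5` would not give the same answer as the search box does.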

Restrict Search
city1 city2             Book flights
site:                   Search only one website or domain.
[#]..[#]                Search within a range of numbers.
filetype: (or ext:)     Find documents of the specified type.
link:                   Find linked pages, i.e., show pages that point to the URL.
book (or books)         Search full-text of books.
phonebook: Disney CA    Search for Disney's phone numbers in CA
rphonebook:             Show residential phonebook listings.
movie:                  Find reviews and show times.
stocks:                 Given ticker symbols, show stock info
weather                 Show weather for a given location (zip code or city)

Restrict Search
info: (or id:)    Find info about a page.
related:          List web pages that are similar/related to the URL
inanchor:         The terms must appear in anchor text of links to the page
intext:           The terms must appear in the text of the page
intitle:          The terms must appear in the title of the page
inurl:            The terms must appear in the URL of the page
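The operators in the tables above are just text typed into the search box, so a query can be assembled from its parts. The helper below is a hypothetical convenience function, not part of any Google tool. It covers only a few of the operators listed (quoted phrases, the minus sign, site:, filetype:, and intitle:).

```python
def build_query(terms, phrase=None, exclude=(), site=None, filetype=None, intitle=None):
    """Assemble a search query string from plain terms and a few operators."""
    parts = list(terms)
    if phrase:
        # Quotes ask for the exact phrase.
        parts.append('"%s"' % phrase)
    # A leading minus sign excludes a word.
    parts.extend("-" + word for word in exclude)
    if site:
        parts.append("site:" + site)
    if filetype:
        parts.append("filetype:" + filetype)
    if intitle:
        parts.append("intitle:" + intitle)
    return " ".join(parts)
```

For example, `build_query(["admissions"], site="uwstout.edu", filetype="pdf")` produces the query text `admissions site:uwstout.edu filetype:pdf`, which restricts results to PDF documents on that domain.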

Webmaster Group: email@example.com
http://www3.uwstout.edu/webdev
SEO Cheat Sheet: http://www.seomoz.org/user_files/SEO_Web_Developer_Cheat_Sheet.pdf
SEO Guide (more in depth): http://www.seomoz.org/article/beginners-1-page
Google Searching Help: http://search.uwstout.edu/user_help.html
http://www.googleguide.com/