Search Engines
comScore February 2011 Rankings




• Google: 65.6%
• Yahoo!: 16.1%
• Bing: 13.1%
• Ask: 3.4%
• AOL: 1.7%
Why?
“SEO Expert”
== “Spammer”
White Hat vs.
Black Hat
Discovery +
Navigation
A Story...
“Not my audience!”
“Experts” Not Needed
Professional Practices
• User-Centric Design
• Test-Driven Development
• DRY and Maintainable Code
• Server Performance
• Client-Side Performance
• Search Engine Considerations
Six Simple Rules
• You can’t outsmart Google (or Bing or Yahoo!)
• Follow Google’s advice
• Obey conventions and standards
• Stay away from hacks
• Understand how search engines work
• Think like a searcher
Search Engine Pipeline
• Crawling
• Indexing
• Ranking
<crawling>
Discovery
• Links to your pages from other sites
• Links to your pages from within your site
• Your sitemap.xml
Check internal links


$ wget --mirror http://example.com/
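
As a complement to mirroring the site, here is a minimal Ruby sketch of listing the internal links found on one page. The `internal_links` name and the regex-based extraction are illustrative assumptions; a real checker would use a proper HTML parser and then request each URL.

```ruby
require "uri"

# Returns the absolute URLs of links on the page that point at the same host.
# Assumption: a simple regex is good enough to find href attributes here.
def internal_links(html, base_url)
  base_host = URI.parse(base_url).host
  hrefs = html.scan(/<a\s[^>]*href=["']([^"'#]+)/i).flatten
  links = hrefs.filter_map do |href|
    begin
      uri = URI.join(base_url, href) # resolve relative links against the page URL
    rescue URI::Error
      next # skip hrefs that are not valid URIs
    end
    uri.to_s if uri.host == base_host
  end
  links.uniq
end

html = %(<a href="/about">About</a> <a href="http://other.com/x">Other</a>)
puts internal_links(html, "http://example.com/")
```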
sitemap.xml
• Tell search engines exactly what you
 want them to crawl
• sitemaps.org
• Limit per sitemap: 50,000 URLs, 10MB
• Can specify multiple sitemaps with a
 sitemap index
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>http://example.com/about</loc>
      <lastmod>2010-01-01</lastmod>
      <changefreq>monthly</changefreq>
      <priority>0.8</priority>
   </url>
</urlset>
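
For sites that exceed the 50,000-URL limit, the split into several sitemaps plus a sitemap index can be sketched as follows. This is a plain-Ruby illustration, not a definitive implementation: the `sitemap-N.xml` file naming and the `sitemap_index` helper are assumptions, and the 10MB size limit is not checked.

```ruby
SITEMAP_URL_LIMIT = 50_000 # per-sitemap URL limit from sitemaps.org

XML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;",
                '"' => "&quot;", "'" => "&apos;" }.freeze

def escape_xml(text)
  text.to_s.gsub(/[&<>"']/, XML_ESCAPES)
end

# Splits the URLs into chunks and returns [chunks, index_xml].
# Assumption: chunk i is served at "#{base_url}/sitemap-#{i}.xml".
def sitemap_index(urls, base_url)
  chunks = urls.each_slice(SITEMAP_URL_LIMIT).to_a
  entries = chunks.each_index.map do |i|
    loc = escape_xml("#{base_url}/sitemap-#{i + 1}.xml")
    "   <sitemap><loc>#{loc}</loc></sitemap>"
  end
  xml = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    *entries,
    "</sitemapindex>"
  ].join("\n") + "\n"
  [chunks, xml]
end

urls = (1..120_001).map { |n| "http://example.com/items/#{n}" }
chunks, index_xml = sitemap_index(urls, "http://example.com")
puts chunks.length
puts index_xml
```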
Generating sitemap.xml
• Write it by hand, stick it in public/
• Build a controller, action, and route
 entry to respond to ‘sitemap.xml’. Use
 XML Builder to generate the entries.
 Cache it.
• Importantly: Strive for 100% coverage.
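
The generation step can be sketched in plain Ruby. The slide suggests XML Builder inside a Rails action; this dependency-free version builds the same document with strings instead, and the `build_sitemap` helper and the entry hashes are illustrative assumptions.

```ruby
XML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;",
                '"' => "&quot;", "'" => "&apos;" }.freeze

def escape_xml(text)
  text.to_s.gsub(/[&<>"']/, XML_ESCAPES)
end

# Builds a urlset document from hashes such as
# { loc: "...", lastmod: "...", changefreq: "...", priority: 0.8 }.
def build_sitemap(entries)
  urls = entries.map do |entry|
    fields = entry.map do |name, value|
      "      <#{name}>#{escape_xml(value)}</#{name}>"
    end
    ["   <url>", *fields, "   </url>"].join("\n")
  end
  [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    *urls,
    "</urlset>"
  ].join("\n") + "\n"
end

puts build_sitemap([
  { loc: "http://example.com/about", lastmod: "2010-01-01",
    changefreq: "monthly", priority: 0.8 }
])
```

In a Rails app the entries would typically come from the models behind each public page, which is what makes 100% coverage practical.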
robots.txt
• Exclusion rather than inclusion
• robotstxt.org



User-agent: *
Disallow: /profile
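
To see how a crawler might read the rules above, here is a simplified Ruby sketch. It only handles `User-agent` and `Disallow` with prefix matching; `Allow`, wildcards, and grouped user agents are deliberately ignored, and both helper names are assumptions.

```ruby
# Collects the Disallow prefixes that apply to the given user agent.
def disallowed_prefixes(robots_txt, agent = "*")
  prefixes = []
  applies = false
  robots_txt.each_line do |line|
    line = line.sub(/#.*/, "").strip # drop comments and surrounding whitespace
    next if line.empty?
    field, value = line.split(":", 2).map(&:strip)
    next if value.nil? # not a "field: value" line
    case field.downcase
    when "user-agent"
      applies = (value == agent)
    when "disallow"
      prefixes << value if applies && !value.empty?
    end
  end
  prefixes
end

# True when no Disallow prefix matches the start of the path.
def crawlable?(robots_txt, path)
  disallowed_prefixes(robots_txt).none? { |prefix| path.start_with?(prefix) }
end

robots = "User-agent: *\nDisallow: /profile\n"
puts crawlable?(robots, "/profile/42")
puts crawlable?(robots, "/about")
```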
Be nice to the crawler
• Be performant. Fast server response.
 Fast page load. Compress files. Honor
 the If-Modified-Since header.
• Non-www vs. www - pick one.

• Ensure unique content. Use <link
 rel="canonical"/> where
 appropriate.
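
Honoring If-Modified-Since means answering 304 Not Modified when the page has not changed since the crawler's last visit. A minimal sketch, assuming the application knows each page's last-modified time (the `conditional_get` helper is hypothetical; Rails offers its own conditional GET helpers):

```ruby
require "time"

# Returns [status, headers] for a page last changed at last_modified (a Time),
# given the raw If-Modified-Since request header (or nil).
def conditional_get(last_modified, if_modified_since)
  headers = { "Last-Modified" => last_modified.httpdate }
  since = begin
    Time.httpdate(if_modified_since) if if_modified_since
  rescue ArgumentError
    nil # treat an unparseable date as if the header were absent
  end
  # HTTP dates have one-second resolution, so compare whole seconds.
  if since && last_modified.to_i <= since.to_i
    [304, headers]
  else
    [200, headers]
  end
end

changed_at = Time.utc(2011, 2, 1, 12, 0, 0)
p conditional_get(changed_at, changed_at.httpdate)
```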
</crawling>
<indexing>
Don’t sabotage it
• Don’t use a 302 redirect when you
 mean a 301 redirect.
• Make sure images, video, Flash,
 Silverlight, and AJAX are accessible.
• See the Google Webmaster Central
 Blog for details.
• Region-specific content? Think about
 the bots.
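
A permanent move, such as sending the www host to the non-www one, should answer with 301 rather than 302. A minimal sketch of that decision, where `canonical_redirect` is a hypothetical helper rather than a Rails API:

```ruby
require "uri"

# Returns [301, headers] when the URL is on a non-canonical host, else nil.
def canonical_redirect(url, canonical_host)
  uri = URI.parse(url)
  return nil if uri.host == canonical_host
  uri.host = canonical_host # keep scheme, path, and query string intact
  [301, { "Location" => uri.to_s }]
end

p canonical_redirect("http://www.example.com/about?page=2", "example.com")
p canonical_redirect("http://example.com/about", "example.com")
```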
</indexing>
<ranking>
<title>
• Most important element to search
 engines
• Think long and hard about it
• Keywords! Think like a searcher.
• Best format: Page Title | Site Name
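
The "Page Title | Site Name" format is easy to centralize in one helper so every page follows it. A small sketch, with `page_title` as an assumed helper name:

```ruby
# Joins the page title and site name, skipping whichever part is blank.
def page_title(page, site)
  [page, site].map { |part| part.to_s.strip }.reject(&:empty?).join(" | ")
end

puts page_title("About Us", "Example Co")
```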
URLs
• Override to_param for pretty URLs.

• Dashes are word separators,
 underscores are not. Use dashes.
• International (country-code) domains are
 treated as such: search engines
 associate them with that country.
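
Overriding to_param can be sketched outside Rails with a plain Struct. The `Article` class and its fields are assumptions for illustration; in a Rails model, ActiveSupport's `parameterize` would normally replace the hand-written slug logic.

```ruby
Article = Struct.new(:id, :title) do
  # Lowercases the title and joins its words with dashes (not underscores).
  def slug
    title.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-|-\z/, "")
  end

  # Keeping the id in front lets a finder still look the record up by id.
  def to_param
    "#{id}-#{slug}"
  end
end

puts Article.new(42, "Search-Friendly Web Development!").to_param
```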
<meta>
• <meta name="description"
  content="..." />
• Make it unique for every page. Use
  content_for.
• Shown to users, doesn’t affect ranking.

• <meta name="keywords" ... /> is
 ignored
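
Because each page supplies its own description, the tag is worth generating in one place. A plain-Ruby sketch follows; the `meta_description_tag` helper is an assumption, and in Rails the text would typically be passed up from each view with content_for.

```ruby
HTML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;",
                 '"' => "&quot;" }.freeze

# Collapses whitespace and escapes the text so it is safe inside an attribute.
def meta_description_tag(text)
  content = text.to_s.gsub(/\s+/, " ").strip.gsub(/[&<>"]/, HTML_ESCAPES)
  %(<meta name="description" content="#{content}" />)
end

puts meta_description_tag("Tips & tricks\n for search-friendly sites")
```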
Headings and Content
• <h> tags should be used appropriately.

• Page content should match what the
 <title> and <h> tags refer to.
• Limit use of text-indent:-9999px
 and display:none in CSS.
Rich Snippets
• Microformats, RDFa, Microdata
</ranking>
Tools
• Google Webmaster Tools
• Bing Webmaster Tools
• Yahoo! Site Explorer
Five Takeaways
• Think like a searcher
• Create a sitemap.xml

• Optimize your <title>s

• Use Google Webmaster Tools
• Read the Google Webmaster Central Blog
Search-Friendly Web Development at RubyNation
