Oftentimes SEO is not a technical priority for a development team, mostly because it is difficult and takes a significant investment of time and effort. This session covers how-to information and SEO advice on adjusting for server and design issues that may be hurting your search engine optimization efforts. We will discuss the 3 main factors of technical SEO: crawling, indexation, and ranking. Additional topics include redirects & server delivery, robots, site architecture, site performance, sitemap protocols, and more.
9. Robots Meta Tags
The robots meta tag is used to tell search
engine crawlers if they are allowed to index a
specific page and follow its links.
<meta name="robots" content="noindex">
Superior to robots.txt for keeping a page out of search results.
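To audit which pages carry this tag, a small script can read the robots directives out of a page's HTML. This is a minimal sketch using only Python's standard library; the class and function names are made up for illustration, and it only looks at `<meta name="robots">`, not at HTTP headers or crawler-specific meta names.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        attributes = {name.lower(): (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() != "robots":
            return
        for directive in attributes.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots meta directives present in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(robots_directives(page))
```

A page with no robots meta tag yields an empty set, which crawlers treat as permission to index and follow.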
10. What are HTTP Status Codes?
HTTP response status codes are returned whenever
search engines or website visitors make a
request to a web server.
20. Site Performance
Users have a very limited attention span, and if
your site takes too long to load, they will leave.
Search engine crawlers have a limited amount
of time that they can allocate to each site on the
Internet.
29. Best Practice URLs
● Is the URL short and user-friendly?
● Does the URL include relevant keywords?
● Is the URL using subfolders instead of
subdomains?
● Does the URL avoid using excessive
parameters?
● Is the URL using hyphens to separate words?
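The checklist above can be turned into a rough automated check. The sketch below is only one possible reading of it: the length and parameter thresholds, the issue labels, and the function name are all assumptions chosen for illustration, and the subdomain test naively assumes a two-label domain such as example.com.

```python
from urllib.parse import parse_qsl, urlsplit

MAX_URL_LENGTH = 75  # hypothetical cutoff for "short"
MAX_PARAMETERS = 2   # hypothetical cutoff for "excessive parameters"


def url_issues(url, keywords=()):
    """Return a list of checklist items the URL appears to fail."""
    parts = urlsplit(url)
    path = parts.path.lower()
    issues = []

    if len(url) > MAX_URL_LENGTH:
        issues.append("too-long")

    # Only checked when the caller supplies the keywords that matter.
    if keywords and not any(word.lower() in path for word in keywords):
        issues.append("no-keyword")

    # Naive: anything in front of a two-label domain other than "www".
    labels = (parts.hostname or "").lower().split(".")
    if len(labels) > 2 and labels[0] != "www":
        issues.append("subdomain")

    if len(parse_qsl(parts.query, keep_blank_values=True)) > MAX_PARAMETERS:
        issues.append("too-many-parameters")

    # Underscores or spaces used where hyphens are preferred.
    if "_" in path or "%20" in path or " " in path:
        issues.append("non-hyphen-separator")

    return issues


print(url_issues("http://www.example.com/blog/technical-seo", keywords=["seo"]))
print(url_issues("http://blog.example.com/technical_seo?a=1&b=2&c=3"))
```

An empty list means the URL passed every check in this sketch, not that it is guaranteed to be a good URL.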
30. What Is Duplicate Content?
Duplicate content exists when any two (or
more) pages share the same or similar content.
http://www.seomoz.org/learn-seo/duplicate-content
31. True Duplicates
A true duplicate is any page that is 100%
identical (in content) to another page. These
pages only differ by the URL:
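One way to find true duplicates in a crawl is to hash each page body and group the URLs that share a digest. This is a minimal sketch, assuming the crawl results are already available as a dictionary of URL to content; the function name and sample URLs are hypothetical.

```python
import hashlib
from collections import defaultdict


def true_duplicates(pages):
    """Group URLs whose content is byte-for-byte identical.

    pages: dict mapping URL -> page content (str).
    Returns a list of URL groups, each with two or more URLs.
    """
    groups = defaultdict(list)
    for url, content in pages.items():
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


crawl = {
    "http://example.com/a": "hello",
    "http://example.com/a?ref=1": "hello",
    "http://example.com/b": "other",
}
print(true_duplicates(crawl))
```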
32. Near Duplicates
A near duplicate differs from another page (or
pages) by a very small amount – it could be a
block of text, an image, or even the order of
the content:
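Near duplicates cannot be caught by exact comparison, so a similarity score is needed. The sketch below uses Jaccard similarity over three-word shingles, which is one common technique and not necessarily what any search engine uses; the function names and the shingle size are assumptions.

```python
def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a, text_b, size=3):
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    a = shingles(text_a, size)
    b = shingles(text_b, size)
    if not a and not b:
        return 1.0  # two empty texts are treated as identical
    return len(a & b) / len(a | b)


red = "red ipod nano 8gb music player"
blue = "blue ipod nano 8gb music player"
print(similarity(red, blue))
```

A score of 1.0 means the shingle sets match exactly. Where to draw the line for "near duplicate" is a judgment call that depends on the site.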
33. Cross-domain Duplicates
A cross-domain duplicate occurs when two
websites share the same piece of content:
These duplicates could be either “true” or “near”
duplicates.
The robots exclusion protocol (REP), or robots.txt, is a text file webmasters create to instruct robots (typically search engine robots) on how to crawl & index pages on their website. http://www.seomoz.org/learn-seo/robotstxt
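To see how a crawler would read a robots.txt file, Python's standard `urllib.robotparser` module can evaluate rules against a URL. This is a minimal sketch: the rules and URLs are made-up examples, and `is_crawlable` is a helper name invented here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /cart/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, user_agent="*"):
    """Return True if the parsed rules allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


print(is_crawlable("http://www.example.com/products/widget"))
print(is_crawlable("http://www.example.com/admin/login"))
```

This only answers whether a crawler may fetch a URL. Whether a page appears in results is a separate question.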
The noindex value tells engines they can visit the page, but they are not allowed to display the URL in results.
HyperText Transfer Protocol (or HTTP) response status codes are returned whenever search engines or website visitors make a request to a web server. These three-digit codes indicate the response and status of HTTP requests. http://www.seomoz.org/learn-seo/http-status-codes
2xx (for example, 200): these codes indicate success.
3xx codes are types of redirection. 301 means this and all future requests should be directed to the given URI.
302 requires the client to perform a temporary redirect (the original describing phrase was "Moved Temporarily"). Make life easy and pass your link juice: use a 301.
404: the requested resource could not be found. 404s are important. Do not redirect missing pages to the home page, and do not return a 200 status for them.
5xx codes indicate cases in which the server is aware that it has encountered an error or is otherwise incapable of performing the request.
503: the server is currently unavailable (because it is overloaded or down for maintenance).
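The status code notes above can be condensed into a small lookup. This is a sketch only: the advice strings paraphrase these notes, the function name is invented, and the reason phrases come from Python's standard `http.HTTPStatus` enum.

```python
from http import HTTPStatus


def seo_note(code):
    """Return a one-line description of a status code with an SEO hint."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Unknown"

    if 200 <= code < 300:
        note = "success"
    elif code == 301:
        note = "permanent redirect, preferred for passing link value"
    elif code == 302:
        note = "temporary redirect, consider a 301 if the move is permanent"
    elif 300 <= code < 400:
        note = "redirection"
    elif code == 404:
        note = "not found, serve a real 404 rather than a redirect or a 200"
    elif 400 <= code < 500:
        note = "client error"
    elif code == 503:
        note = "server temporarily unavailable"
    elif 500 <= code < 600:
        note = "server error"
    else:
        note = "not a standard status class"

    return f"{code} {phrase}: {note}"


for status in (200, 301, 302, 404, 503):
    print(seo_note(status))
```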
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156184 Sitemaps are particularly helpful if your site has dynamic content, or if your site has pages that aren't easily discovered by Googlebot during the crawl process, for example pages featuring rich AJAX or images. Manually submit and check over your sitemap. Auto-generated sitemaps can carry baggage: the sitemap should contain no error URLs and no 301s.
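Checking a sitemap for errors and 301s can be partly scripted. The sketch below parses a sitemap with the standard library and flags every entry whose status is not 200. It assumes the status codes were already gathered by a separate crawl and passed in as a dictionary; the function names, sample URLs, and statuses are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the <loc> values listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def problem_entries(urls, status_by_url):
    """Return {url: status} for entries that did not answer with 200.

    URLs missing from status_by_url are reported with a status of None.
    """
    return {
        url: status_by_url.get(url)
        for url in urls
        if status_by_url.get(url) != 200
    }


SAMPLE = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/old-page</loc></url>
  <url><loc>http://www.example.com/missing</loc></url>
</urlset>
"""

statuses = {
    "http://www.example.com/": 200,
    "http://www.example.com/old-page": 301,
    "http://www.example.com/missing": 404,
}
print(problem_entries(sitemap_urls(SAMPLE), statuses))
```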
Your site architecture defines the overall structure of your website, including its vertical depth (how many levels it has) as well as its horizontal breadth at each level. When evaluating your site architecture, identify how many clicks it takes to get from the homepage to other important pages. Also, evaluate how well pages are linking to others in the site's hierarchy, and make sure the most important pages are prioritized in the architecture. Ideally, you want to strive for a flatter site architecture that takes advantage of both vertical and horizontal linking opportunities. It’s about getting the best, most relevant content in front of users and reducing the number of times they have to click to find it. The same applies to search engines: by flattening your site architecture, you can make potential gains in indexation metrics such as the number of pages generating search engine traffic and the number of pages in a search engine index.
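Counting clicks from the homepage is a breadth-first search over the site's internal links. This is a minimal sketch, assuming the link graph has already been collected into a dictionary of page to linked pages; the function name and sample paths are made up.

```python
from collections import deque


def click_depths(links, homepage):
    """Return {page: minimum clicks from homepage} via breadth-first search.

    links: dict mapping a page to the list of pages it links to.
    Pages that cannot be reached from the homepage are left out.
    """
    depths = {homepage: 0}
    queue = deque([homepage])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


site = {
    "/": ["/products", "/about"],
    "/products": ["/products/widget"],
    "/products/widget": ["/products/widget/specs"],
    "/orphan": ["/"],
}
for page, depth in click_depths(site, "/").items():
    print(depth, page)
```

Important pages that show up with a high depth, or that are missing from the result entirely, are candidates for better internal linking.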
Although search engine crawlers are getting smarter, it is still safer to avoid Flash and JavaScript navigation than to try to fix it.
Actual count: the pages that search engines are allowed to access. Index count: how many of those pages are actually being indexed by the search engines.
The index and actual counts are roughly equivalent: this is the ideal scenario. The index count is significantly smaller than the actual count: this scenario indicates that the search engines are not indexing many of your site's pages. The index count is significantly larger than the actual count: this scenario usually suggests that your site is serving duplicate content.
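The three scenarios reduce to a ratio check. In this sketch the 10% tolerance for "roughly equivalent" is an arbitrary assumption, as are the function name and the returned labels; pick a tolerance that suits the size of the site.

```python
def compare_counts(index_count, actual_count, tolerance=0.10):
    """Classify the relationship between indexed pages and crawlable pages."""
    if actual_count <= 0:
        raise ValueError("actual_count must be positive")
    ratio = index_count / actual_count
    if ratio < 1 - tolerance:
        return "under-indexed"
    if ratio > 1 + tolerance:
        return "possible duplicate content"
    return "roughly equivalent"


print(compare_counts(980, 1000))
print(compare_counts(400, 1000))
print(compare_counts(2500, 1000))
```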
Hopefully, you never have to deal with this. But if you think your site has been penalized, here are 4 steps to help you fix the situation:
Step 1: Make sure you have actually been penalized. Use the previous accessibility checks.
Step 2: Identify the Reason(s) for the Penalty. Once you're sure the site has been penalized, you need to investigate the root cause for the penalty. If you receive a formal notification from a search engine, this step is already complete.
Step 3: Fix the Site's Penalized Behavior.
Step 4: Beg for forgiveness.
Of the on-page ranking factors, we'll focus on URLs and duplicate content. http://www.seoptimise.com/blog/2011/06/30-new-google-ranking-factors-you-may-over-or-underestimate.html
“www” vs. Non-www: For sitewide duplicate content, this is probably the biggest culprit.
Duplicate Paths: Having duplicate paths to a page is perfectly fine, but when duplicate paths generate duplicate URLs, then you’ve got a problem.
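A quick way to surface www and trailing-slash duplicates in a URL list is to normalize every URL to one canonical form and group the ones that collide. The sketch below picks the non-www, no-trailing-slash form purely as an assumption (either choice works as long as it is consistent), and the function names and URLs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url):
    """Normalize scheme and host case, strip "www.", and drop a trailing slash."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if parts.port:
        host = f"{host}:{parts.port}"
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))


def duplicate_url_groups(urls):
    """Return {canonical form: [original URLs]} for forms seen more than once."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical_url(url)].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}


crawled = [
    "http://www.example.com/store/",
    "http://example.com/store",
    "http://example.com/about",
]
print(duplicate_url_groups(crawled))
```

This does not catch the same page served under genuinely different paths. Those need a content comparison, not a URL comparison.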
Product Variations: Product variant pages branch off the main product page and differ only by an option. Example: the iPod Nano, where every page is the same apart from the color.
Geo-keyword Variations: Back in the good old days, you could just copy all of your pages hundreds of times, add a city name to the URL, and use find and replace. Content wins here; nowadays you need to get creative.
Other “Thin” Content
Scraped Content: Scraped content is just copied content, except that you didn’t ask permission. It's illegal. STOP IT.
Google Webmaster Tools: In Google Webmaster Tools, you can pull up a list of duplicate TITLE tags and meta descriptions Google has crawled. This is a good starting point.
Google’s “site:” Command: When you already have a sense of where you might be running into trouble and need to take a deeper look, Google’s “site:” command is very powerful.