Get Your Nerd On: Technical SEO. JoomlaDay! Chicago 2012. Jessica Dunbar
What we will cover: Accessibility, Indexability, On-Page Ranking Factors
Accessibility
What is Robots.txt? A file used to restrict search engine crawlers from accessing sections of your website.
Robots Cheat Sheet
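As a rough illustration of how robots.txt rules are evaluated, the sketch below feeds a small rule set to Python's standard `urllib.robotparser`. The rules, paths, and sitemap URL are hypothetical and are not taken from the talk's cheat sheet.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths and sitemap URL are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /administrator/
Disallow: /tmp/

Sitemap: http://www.example.com/sitemap.xml
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A crawler that honors the file may not fetch anything under /administrator/.
blocked = parser.can_fetch("*", "http://www.example.com/administrator/index.php")
# Paths that match no Disallow rule stay crawlable.
allowed = parser.can_fetch("*", "http://www.example.com/about")
print(blocked, allowed)
```

Checking rules this way before deploying a robots.txt change can help catch a Disallow line that blocks more than intended.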
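Robots Meta Tags: The robots meta tag is used to tell search engine crawlers whether they are allowed to index a specific page and follow its links. Example: <meta name="robots" content="noindex">. For keeping a page out of search results this is superior to robots.txt, which blocks crawling but does not stop a URL from appearing in results.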
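A minimal sketch of how an audit script might read the robots meta tag from a page, using only Python's standard `html.parser`. The class and function names are made up for this example, and it only looks at `<meta name="robots">`, not at crawler-specific tags or HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives present in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def is_indexable(html: str) -> bool:
    """A page is treated as indexable unless it says noindex (or none)."""
    return not ({"noindex", "none"} & robots_directives(html))


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(robots_directives(page), is_indexable(page))
```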
What are HTTP Status Codes? Response status codes are returned whenever search engines or website visitors make a request to a web server.
200: OK/Success
301: The data requested has been assigned a new URI; the change is permanent.
302: The data requested actually resides under a different URL; however, the redirection may be altered on occasion.
404: Not Found
500: Internal Server Error
503: Service Unavailable
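The codes above can be grouped the way a crawl report might group them. This is only a sketch: the function name and the category labels are assumptions for illustration, while the reason phrases come from Python's standard `http.HTTPStatus`.

```python
from http import HTTPStatus


def classify_status(code: int) -> str:
    """Map an HTTP status code to a coarse category (labels are illustrative)."""
    if 200 <= code < 300:
        return "success"
    if code == HTTPStatus.MOVED_PERMANENTLY:  # 301
        return "permanent redirect"
    if code == HTTPStatus.FOUND:  # 302
        return "temporary redirect"
    if 300 <= code < 400:
        return "other redirect"
    if code == HTTPStatus.NOT_FOUND:  # 404
        return "not found"
    if code == HTTPStatus.SERVICE_UNAVAILABLE:  # 503
        return "temporarily unavailable"
    if 500 <= code < 600:
        return "server error"
    return "other"


for code in (200, 301, 302, 404, 500, 503):
    print(code, HTTPStatus(code).phrase, "->", classify_status(code))
```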
What are Sitemaps? Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling.
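A minimal sketch of building a sitemap with Python's standard `xml.etree.ElementTree`. It assumes you already have a mapping of URL to HTTP status from your own crawl (the `pages` data here is hypothetical) and keeps only URLs that returned 200, so redirects and errors stay out of the file.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages: dict) -> str:
    """Return sitemap XML listing only the URLs whose status is 200."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, status in pages.items():
        if status != 200:
            continue  # skip redirects (301/302) and errors (404/5xx)
        url_element = ET.SubElement(urlset, "url")
        ET.SubElement(url_element, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical crawl results: URL -> status code.
pages = {
    "http://www.example.com/": 200,
    "http://www.example.com/about": 200,
    "http://www.example.com/old-page": 301,
    "http://www.example.com/missing": 404,
}
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```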
What is site architecture?
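One way to evaluate site architecture is to count how many clicks it takes to get from the homepage to each page. The sketch below does that with a breadth-first search over an internal link graph. The graph is hypothetical; in practice it would come from your own crawl.

```python
from collections import deque


def click_depths(links: dict, homepage: str) -> dict:
    """Return the minimum number of clicks from the homepage to each reachable page."""
    depths = {homepage: 0}
    queue = deque([homepage])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical internal link graph: page -> pages it links to.
links = {
    "/": ["/products", "/about"],
    "/products": ["/products/ipods"],
    "/products/ipods": ["/products/ipods/nano"],
    "/orphan": ["/"],
}
depths = click_depths(links, "/")
print(depths)
```

Pages with a large depth are candidates for more prominent internal links, and pages missing from the result (such as `/orphan` here) cannot be reached from the homepage at all.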
Flash and JavaScript Navigation
Site Performance: Users have a very limited attention span, and if your site takes too long to load, they will leave. Search engine crawlers have a limited amount of time that they can allocate to each site on the Internet.
http://tools.pingdom.com/fpt/ http://developer.yahoo.com/yslow/ https://developers.google.com/speed/pagespeed/insights
Indexability
Site: Command
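The result count from a site: query can be compared with the number of pages you know the site has. A minimal sketch of that comparison follows; the 20% tolerance and the label names are assumptions chosen for illustration, not figures from the talk.

```python
def index_health(indexed_count: int, actual_count: int, tolerance: float = 0.2) -> str:
    """Compare the search engine's index count with the real page count.

    tolerance is an assumed cutoff for what counts as "roughly equivalent".
    """
    if actual_count <= 0:
        raise ValueError("actual_count must be positive")
    ratio = indexed_count / actual_count
    if ratio < 1 - tolerance:
        return "under-indexed"  # many pages are not being indexed
    if ratio > 1 + tolerance:
        return "possible duplicate content"  # more URLs indexed than pages exist
    return "healthy"


print(index_health(950, 1000))
print(index_health(300, 1000))
print(index_health(4000, 1000))
```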
Search Engine Penalties
Make Sure You've Been Penalized
Reason(s) for the Penalty
Fix the Site's Penalized Behavior
On-Page Ranking Factors
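Best Practice URLs
● Is the URL short and user-friendly?
● Does the URL include relevant keywords?
● Is the URL using subfolders instead of subdomains?
● Does the URL avoid using excessive parameters?
● Is the URL using hyphens to separate words?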
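The checklist above can be turned into a rough automated check. In this sketch the length limit, the parameter limit, and the subdomain heuristic (any host with more than two labels that does not start with www) are all assumptions; the heuristic will misfire on domains such as example.co.uk.

```python
from urllib.parse import urlsplit, parse_qsl

MAX_LENGTH = 75  # assumed cutoff for "short"
MAX_PARAMS = 2   # assumed cutoff for "excessive parameters"


def url_issues(url: str, keywords=()) -> list:
    """Return a list of checklist items the URL appears to fail."""
    parts = urlsplit(url)
    issues = []
    if len(url) > MAX_LENGTH:
        issues.append("long")
    path = parts.path.lower()
    if keywords and not any(keyword.lower() in path for keyword in keywords):
        issues.append("no keyword")
    labels = (parts.hostname or "").split(".")
    if len(labels) > 2 and labels[0] != "www":
        issues.append("subdomain")
    if len(parse_qsl(parts.query, keep_blank_values=True)) > MAX_PARAMS:
        issues.append("many parameters")
    if "_" in parts.path:
        issues.append("underscores")
    return issues


print(url_issues("http://www.example.com/technical-seo-tips", keywords=["seo"]))
print(url_issues("http://blog.example.com/index.php?option=com_content&view=article&id=42"))
```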
What Is Duplicate Content? Duplicate content exists when any two (or more) pages share the same or similar content. http://www.seomoz.org/learn-seo/duplicate-content
True Duplicates: A true duplicate is any page that is 100% identical (in content) to another page. These pages only differ by the URL.
Near Duplicates: A near duplicate differs from another page (or pages) by a very small amount. It could be a block of text, an image, or even the order of the content.
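One common way to put a number on "differs by a very small amount" is Jaccard similarity over word shingles. This is a general technique, not something the talk prescribes, and the shingle size of 3 is an assumption.

```python
def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a: str, text_b: str, size: int = 3) -> float:
    """Jaccard similarity of the two texts' shingle sets (1.0 means identical sets)."""
    set_a, set_b = shingles(text_a, size), shingles(text_b, size)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


page_a = "one two three four five six"
page_b = "one two three four five seven"
print(similarity(page_a, page_a))  # a true duplicate scores 1.0
print(similarity(page_a, page_b))  # a near duplicate scores high but below 1.0
```

What score counts as a near duplicate is a judgment call that depends on the site, so any cutoff should be tuned against pages you have already reviewed by hand.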
Cross-domain Duplicates: A cross-domain duplicate occurs when two websites share the same piece of content. These duplicates could be either “true” or “near” duplicates.
“www” vs. Non-www: www.example.com vs. example.com
Staging Servers: example.com vs. dev.example.com
Trailing Slashes ("/"): www.example.com/page vs. www.example.com/page/
Secure (https) Pages
Home-page Duplicates: example.com vs. example.com/index.php
Duplicate Paths: example.com/electronics/ipods vs. example.com/apple-products/ipods
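Several of the URL-level variants above (www vs. non-www, http vs. https, trailing slashes, and index files) can be collapsed by normalizing every URL to one preferred form before comparing. The sketch below assumes www and http as the preferred form and an illustrative list of index file names. It drops any port number, is meant only for the main host (not staging hosts such as dev.example.com), and cannot resolve duplicate paths, which need a canonical choice made by hand.

```python
from urllib.parse import urlsplit, urlunsplit

INDEX_FILES = ("index.php", "index.html")  # assumed list of home-page file names


def canonical_url(url: str, scheme: str = "http", prefer_www: bool = True) -> str:
    """Normalize a URL to one preferred scheme, host form, and path form."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    bare_host = host[4:] if host.startswith("www.") else host
    host = "www." + bare_host if prefer_www else bare_host

    path = parts.path or "/"
    for index_file in INDEX_FILES:
        if path.endswith("/" + index_file):
            path = path[: -len(index_file)]  # keep the directory, drop the file
    if len(path) > 1 and path.endswith("/"):
        path = path.rstrip("/")  # drop the trailing slash except on the root
    return urlunsplit((scheme, host, path, parts.query, ""))


variants = [
    "http://example.com",
    "http://www.example.com/",
    "https://example.com/index.php",
]
print({canonical_url(url) for url in variants})
```

If every variant maps to the same canonical form, a single 301 redirect rule or canonical tag pointing at that form should cover all of them.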
Product Variations
Geo-keyword Variations
Other “Thin” Content
Scraped Content
How To Find
● http://www.seomoz.org/blog/duplicate-content-in-a-post-panda-world
Google Webmaster Tools
Google’s Site: Command
SEOmoz Campaign Manager
Your Own Brain
Questions
Contact: slideshare.net/jessicadunbar, @jessicadunbar, dunbar259@gmail.com
Often, SEO is not a technical priority for a development team, mostly because it is difficult and takes a significant investment of time and effort. This session covers how-to information and SEO advice on adjusting for server and design issues that may be hurting your search engine optimization efforts. We will discuss the 3 main factors of technical SEO: crawling, indexation, and ranking. Additional topics include redirects and server delivery, robots, site architecture, site performance, sitemap protocols, and more.


Published in: Technology, Design
Notes
  • The robots exclusion protocol (REP), or robots.txt, is a text file webmasters create to instruct robots (typically search engine robots) on how to crawl and index pages on their website. http://www.seomoz.org/learn-seo/robotstxt
  • A noindex directive tells engines they can visit the page but are not allowed to display its URL in results.
  • HyperText Transfer Protocol (HTTP) response status codes are returned whenever search engines or website visitors make a request to a web server. These three-digit codes indicate the response and status of HTTP requests. http://www.seomoz.org/learn-seo/http-status-codes
  • 200 or 2xx: These codes indicate success.
  • 3xx codes are types of redirection. 301 means this and all future requests should be directed to the given URI.
  • 302 requires the client to perform a temporary redirect (the original describing phrase was "Moved Temporarily"). Make life easy: to pass your link juice, use a 301.
  • 404: The requested resource could not be found. 404s are important: do not redirect them to the home page or serve them with a 200 status.
  • 5xx codes indicate cases in which the server is aware that it has encountered an error or is otherwise incapable of performing the request.
  • 503: The server is currently unavailable (because it is overloaded or down for maintenance).
  • http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156184 Sitemaps are particularly helpful when your site has dynamic content, or has pages that aren't easily discovered by Googlebot during the crawl process, for example, pages featuring rich AJAX or images. Manually submit and check over your sitemap. Auto-generated sitemaps can have baggage: a sitemap should contain no errors and no 301s.
  • Your site architecture defines the overall structure of your website, including its vertical depth (how many levels it has) as well as its horizontal breadth at each level. When evaluating your site architecture, identify how many clicks it takes to get from the homepage to other important pages. Also evaluate how well pages link to others in the site's hierarchy, and make sure the most important pages are prioritized in the architecture. Ideally, you want a flatter site architecture that takes advantage of both vertical and horizontal linking opportunities. It’s about getting the best, most relevant content in front of users and reducing the number of times they have to click to find it. The same applies to search engines: by flattening your site architecture, you can make potential gains in indexation metrics such as the number of pages generating search engine traffic and the number of pages in a search engine index.
  • Although search engine crawlers are getting smarter, it is still safer to avoid Flash and JavaScript navigation than to try to fix it.
  • Accessibility covers the pages that search engines are allowed to access. Indexability covers how many of those pages are actually being indexed by the search engines.
  • Compare the index count returned by a site: query with the actual number of pages on your site. If the index and actual counts are roughly equivalent, this is the ideal scenario. If the index count is significantly smaller than the actual count, the search engines are not indexing many of your site's pages. If the index count is significantly larger than the actual count, this usually suggests that your site is serving duplicate content.
  • Hopefully, you never have to deal with this. But if you think your site has been penalized, here are 4 steps to help you fix the situation:
  • Step 1: Be sure you have actually been penalized. Use the previous accessibility checks.
  • Step 2: Identify the Reason(s) for the Penalty. Once you're sure the site has been penalized, you need to investigate the root cause of the penalty. If you receive a formal notification from a search engine, this step is already complete.
  • Step 3: Fix the Site's Penalized Behavior. Step 4: Beg for forgiveness.
  • For the on-page ranking factors, we'll focus on URLs and duplicate content. http://www.seoptimise.com/blog/2011/06/30-new-google-ranking-factors-you-may-over-or-underestimate.html
  • “www” vs. Non-www: For sitewide duplicate content, this is probably the biggest culprit.
  • Duplicate Paths: Having duplicate paths to a page is perfectly fine, but when duplicate paths generate duplicate URLs, then you’ve got a problem.
  • Product Variations: Product variant pages come from the main product page and differ only by an option. Example: iPod Nano pages that are all the same except for the color.
  • Geo-keyword Variations: Back in the good old days, you could just copy all of your pages hundreds of times, add a city name to the URL, and use find and replace. Content wins here; nowadays you need to get creative.
  • Other “Thin” Content
  • Scraped Content: Scraped content is just copied content, except that you didn’t ask permission. It's illegal. STOP IT.
  • Google Webmaster Tools: In Google Webmaster Tools, you can pull up a list of duplicate TITLE tags and meta descriptions Google has crawled. This is a good starting point.
  • Google’s Site: Command: When you already have a sense of where you might be running into trouble and need to take a deeper look, Google’s “site:” command is very powerful.
