Site Architecture Best Practices for Search Findability - Adam Audette


The information architecture (IA) of a website is the most essential factor that influences search spidering and (indirectly) indexing and ranking. Above and beyond search findability (the focus here), proper IA is directly related to usability and conversion optimization.

Published in: Technology, Design

  1. 1. Site Architecture Best Practices for Search Findability
      - Adam Audette, President
      - AudetteMedia
  2. 2. What is Information Architecture?
      - Many Different Applications of IA
      - Touches User Experience, Design, SEO, etc.
      - Can Strongly Influence Spidering
      - Complex Topic
  3. 3. Search Engine Dominance
      - 65.5% of users start with a search engine (Enquiro 2007)
      - Google dominates search engine usage (61%+)
  4. 6. A Salient Quote:
      - “We shape our buildings; thereafter they shape us.” - Winston Churchill
  5. 7. Information Architecture (IA)
      - IA is the most essential factor influencing search spidering, indexing, and ranking.
      - IA encompasses labeling and navigation
      - IA is directly related to usability and conversion optimization.
  6. 8. What Will You Learn?
      - Findability is fundamental to project goals
      - Allows users to find sites and sections within them easily
      - Attracts targeted visitors
      - Makes the site more navigable and easier to use
      - Helps Googlebot and other spiders traverse the site
  7. 9. What is Information Architecture?
      1. The structural design of shared information environments.
      2. The combination of organization, labeling, search, and navigation systems within web sites and intranets.
      3. The art and science of shaping information products and experiences to support usability and findability.
      4. An emerging discipline and community of practice focused on bringing principles of design and architecture to the digital landscape.
  8. 10. Peter Morville
      - Father of information architecture
      - Proponent of Findability
  9. 12. A Simple Definition
      - Information architecture is the semantic structure and organization of digital inventories.
  10. 13. What is Findability?
      - A better term than search engine optimization (SEO)
      - Puts the user in the mix
      - Describes a site that is easy to find in search engines
      - A site that is easy to navigate
      - A site that has content organized intuitively
  11. 14. The Balance of Bots and Users
      - The user is often left out in SEO
      - Findability is less myopic than SEO
      - Findability and IA are part of every aspect of web production
  12. 16. People Matter
      - Who do we create sites for?
      - We don’t create sites for search engines, we create sites for people.
      - Balancing the needs of a spider with the needs of your visitor is a critical distinction.
  13. 17. Pieces of the IA Puzzle
      1. Domains
      2. Sections
      3. Categories
      4. Pages
      5. Media
  14. 18. Domains
      - Keep URLs as short as possible (within reason)
      - Use hyphens to separate words
      - Never use spaces, quotes, ampersands, or other problematic ASCII characters
      - URLs are the foundation for crawling and indexing
      - Keep session IDs in URLs to a minimum (2 max)
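
The URL rules on this slide can be sketched as a small helper that turns a page title into a short, hyphen-separated path segment. This is a minimal illustration, not something from the original deck: the function name `slugify` and the `max_words` cutoff are assumptions chosen for the example.

```python
import re
import unicodedata


def slugify(title: str, max_words: int = 6) -> str:
    """Turn a page title into a short, hyphen-separated URL segment.

    max_words is a hypothetical cutoff used here to keep URLs short.
    """
    # Fold accented characters to their closest ASCII equivalents and
    # drop anything that has no ASCII form (for example curly quotes).
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Remove apostrophes so "Don't" becomes "dont" rather than "don-t".
    ascii_text = ascii_text.replace("'", "")
    # Spaces, quotes, ampersands and other punctuation all act as separators.
    words = re.findall(r"[a-z0-9]+", ascii_text.lower())
    return "-".join(words[:max_words])


print(slugify('Site Architecture & "Best Practices"'))
```

Capping the word count is one way to honor "as short as possible (within reason)"; a real site would likely also need a collision check so two titles never map to the same URL.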
  16. 20. Apache mod_rewrite
      - Common canonical homepage issue: several URLs serve the same homepage
      - When you want only the root URL: /
  17. 21. Domains (cont’d)
      - Canonicalization concerns
        - Set association in Google Webmaster Tools
      - Add a rewrite rule so requests redirect to the preferred domain
  18. 23. Apache Rewrite
      - 301 Redirect for Apache
      - Write to a file called ‘.htaccess’.
      - RewriteEngine on
      - RewriteCond %{HTTP_HOST} ^ [NC]
      - RewriteRule ^(.*)$$1 [L,R=301]
      - Redirects to the preferred http:// host. Affects the entire domain.
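
The hostnames in the rewrite rule above did not survive in this transcript, so the decision the rule encodes is sketched here in Python instead. The host `www.example.com` and the function name `canonical_redirect` are placeholders assumed for illustration, not the values from the original slide.

```python
from typing import Optional, Tuple

# Hypothetical preferred host; substitute the site's real canonical hostname.
PREFERRED_HOST = "www.example.com"


def canonical_redirect(host: str, path: str) -> Optional[Tuple[int, str]]:
    """Return (301, target URL) when the request host is not the preferred one."""
    # Compare case-insensitively (the role of the [NC] flag) and ignore any port.
    hostname = host.split(":")[0].lower()
    if hostname == PREFERRED_HOST:
        return None  # already canonical, serve the page normally
    if not path.startswith("/"):
        path = "/" + path
    # Keep the requested path so every URL on the domain is redirected,
    # which is what the captured group (.*) and $1 do in the rewrite rule.
    return 301, f"http://{PREFERRED_HOST}{path}"


print(canonical_redirect("example.com", "/about"))
```

Returning `None` for the canonical host matters: redirecting unconditionally would create a redirect loop.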
  19. 24. IIS (ASP) Rewrite
      - Use ISAPI
  20. 25. Permanent (301) Redirects
      - Always use 301 redirects
      - Redirect permanent /old
  21. 26. Crawling & Diagnostics Tools
      - Monitor site in Google Webmaster Tools console
      - Use other tools for diagnostics
        - Xenu (free)
        - Web Link Validator (paid)
  22. 28. Redirects
      - Strive to use 301 permanent redirects where possible
      - 302 (temporary) redirects do not pass PageRank
      - And that brings up PageRank™
  23. 29. PageRank™
      - Fundamental to Google’s ranking algorithm
      - Misused and misunderstood
      - Visible and invisible
  24. 30. Sections and Categories
      - Entry points into deeper content
      - Hubs for users to find their way (waypoints)
      - Orient users and crawlers with breadcrumbs
  25. 31. Sections and Categories
      - Bridge shallow pages/root domain
      - High importance in internal linking
      - Label them with relevant terms
  26. 32. Basics of Site Architecture
  27. 33. Keyword Tools
      - Google Keyword Tool
      - Site Analytics (useful for site search too)
      - Apply keywords to critical areas
        - Title tag
        - Meta data (description for snippet)
        - Cross links
        - Page headers and content
  28. 37. Pages
      - Standards-compliant code
      - Semantic structure
        - Title tags under 70 characters
        - Descriptions under 155 characters
        - Relevant, focused keywords
        - Accurate header tags
        - Use of bulleted and numbered lists
        - Bolding for emphasis (with <strong>)
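
The two length limits on this slide are easy to check automatically. The sketch below uses only the Python standard library; the class and function names (`HeadAudit`, `audit_head`) and the wording of the messages are assumptions made for the example.

```python
from html.parser import HTMLParser

TITLE_LIMIT = 70         # from the slide: title tags under 70 characters
DESCRIPTION_LIMIT = 155  # from the slide: descriptions under 155 characters


class HeadAudit(HTMLParser):
    """Collect the <title> text and the meta description from an HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attributes.get("name") or "").lower() == "description":
            self.description = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text can arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def audit_head(html: str) -> list:
    """Return a list of problems found in the page's title and description."""
    parser = HeadAudit()
    parser.feed(html)
    parser.close()
    problems = []
    title = parser.title.strip()
    description = parser.description.strip()
    if not title:
        problems.append("missing title")
    elif len(title) >= TITLE_LIMIT:
        problems.append(f"title is {len(title)} characters (limit {TITLE_LIMIT})")
    if not description:
        problems.append("missing meta description")
    elif len(description) >= DESCRIPTION_LIMIT:
        problems.append(
            f"description is {len(description)} characters (limit {DESCRIPTION_LIMIT})"
        )
    return problems
```

Running `audit_head` over each template in a site gives a quick list of pages whose titles or descriptions are likely to be truncated in search results.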
  29. 38. Pages
      - Meta keyword tag
        - May be used by Yahoo!
        - Totally abused and nearly worthless
        - Add 3-4 keyword modifiers here
      - Is often misused
        - When in doubt, remove it!
  30. 40. Pages
      - Aim for W3C standards-compliant code
      - Search engines like accessibility too:
        - Alternate text attributes
      - Keep code external
        - JS files in includes
        - CSS external
      - Header tags (h1, h2, etc.) should reinforce title tags and body content
  31. 42. Internal Linking
      - Main index page is entry point
        - Passes most value
      - Use plain text links with relevant keywords
      - Create HTML sitemaps (link from each page)
        - Ensure sitemap link is above link threshold
      - Shallow pages indicate importance
        - Deep pages indicate lesser importance
  32. 44. Internal Linking
      - Deep linking from home page
        - Where possible and visitor-centric
      - Anchor text matters
        - Link to pages with relevant link text
      - Never use rel=nofollow on internal links
        - An SEO technique
      - Link Thresholds
  33. 45. Final Considerations
      - Relevant keywords in file names
        - Where possible and visitor-centric
      - XML Sitemaps
        - Excellent for control over crawling
        - Add directive to robots.txt
          - Sitemap:
  34. 46. XML Sitemap Syntax
      <?xml version="1.0" encoding="UTF-8"?>
      <urlset xmlns=" ">
        <url>
          <loc> /</loc>
          <lastmod>1987-05-25</lastmod>
          <changefreq>monthly</changefreq>
          <priority>0.8</priority>
        </url>
      </urlset>
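
A sitemap in this shape can be generated rather than hand-written. The sketch below uses Python's standard `xml.etree.ElementTree` and the namespace defined by the sitemaps.org protocol; the function name `build_sitemap`, the dictionary keys, and the `www.example.com` URL are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Serialize the sitemap namespace as the default (unprefixed) namespace.
ET.register_namespace("", SITEMAP_NS)


def build_sitemap(entries) -> bytes:
    """Build sitemap XML from dicts with 'loc' and optional metadata keys."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for entry in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        # Emit child elements in the order shown on the slide.
        for field in ("loc", "lastmod", "changefreq", "priority"):
            if field in entry:
                child = ET.SubElement(url, f"{{{SITEMAP_NS}}}{field}")
                child.text = str(entry[field])
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


# Hypothetical page; the URL and date are placeholders.
sitemap_xml = build_sitemap([
    {
        "loc": "http://www.example.com/",
        "lastmod": "2009-01-15",
        "changefreq": "monthly",
        "priority": 0.8,
    }
])
print(sitemap_xml.decode("utf-8"))
```

Generating the file from the same data that drives the site's navigation keeps the sitemap from drifting out of date.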
  35. 47. XML Sitemap Locations
      - Default Locations Search Engines Look for Sitemaps
      - Visit / for a free sitemap
  36. 48. Excluding Content
      - Exclude areas that can trap crawlers
      - Common Robot Traps
        - Input Forms
        - Excessive session IDs in URL
        - Pages Restricted by Cookies
        - Frames
        - Logins
  37. 49. Robots.txt Syntax
      User-agent: *
      Disallow: /account_private/
      Disallow: /privatefile.html

      User-agent: Googlebot
      Disallow: /user_account.php
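
Rules like these can be tested before deployment with Python's standard `urllib.robotparser`. This is a minimal sketch: the sample file mirrors the slide but uses the token `Googlebot` without a version suffix, because this parser matches on the product name, and `SomeBot` is a made-up crawler name.

```python
from urllib.robotparser import RobotFileParser

# Sample rules modeled on the slide; "Googlebot" is written without a version.
ROBOTS_TXT = """\
User-agent: *
Disallow: /account_private/
Disallow: /privatefile.html

User-agent: Googlebot
Disallow: /user_account.php
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# A crawler with no group of its own falls back to the "*" group.
print(rules.can_fetch("SomeBot", "/account_private/settings"))
# A crawler with its own group follows only that group.
print(rules.can_fetch("Googlebot", "/user_account.php"))
print(rules.can_fetch("Googlebot", "/privatefile.html"))
```

The last check is worth noticing: in this parser a crawler that has its own group ignores the `*` group, so paths meant to be blocked for everyone need to be repeated in each named group.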
  38. 50. Robots.txt Syntax
      - Major search engine crawlers recognize wildcards in robots.txt
        - Disallow: /*?zv=*
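
Python's `urllib.robotparser` does not expand `*` inside paths, so the wildcard rule above is illustrated here with a small hand-written matcher. It is a sketch of the common interpretation, where `*` matches any run of characters and a rule matches from the start of the path; the function name `rule_matches` and the sample paths are assumptions.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Check whether a robots.txt path pattern containing * matches a URL path."""
    # Escape the literal pieces and let each * match any run of characters.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    # Rules apply from the start of the path, which is what re.match enforces.
    return re.match(regex, path) is not None


print(rule_matches("/*?zv=*", "/catalog/item.php?zv=123"))
print(rule_matches("/*?zv=*", "/catalog/item.php?id=5"))
```

A matcher like this is handy for checking that a wildcard rule blocks the parameterized URLs it targets without also blocking ordinary pages.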
  39. 51. Other Stuff
      - No ODP
        - Search engines may use ODP titles/descriptions
          - Control your snippets
        - meta name="robots" content="noodp"
      - Never use meta tags like:
        - meta name="robots" content="index, follow"
  40. 52. Link Development
      - Internal Link Development
      - Relevant departments and services
      - External link development
      - Government branches
      - Partner organizations
      - Local non-profits
  41. 53. Internal Linking Consistency
      - Link to pages within site consistently
      - Use descriptive link text where possible
      - Aim to stay below link thresholds
      - Refrain from using rel="nofollow" on internal links
      - Link to deep pages from shallow pages when possible
      - Cross link deep pages
  42. 54. Shallow and Deep Pages
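
One way to make "shallow" and "deep" concrete is to count the minimum number of clicks needed to reach each page from the home page. The sketch below does this with a breadth-first search over an internal link map; the function name `click_depths` and the sample site are hypothetical.

```python
from collections import deque


def click_depths(links: dict, home: str = "/") -> dict:
    """Return the minimum number of clicks from the home page to each page.

    links maps each page URL to the list of internal URLs it links to.
    """
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            # The first time a page is reached is via a shortest path.
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site structure used only for illustration.
site = {
    "/": ["/services/", "/about/"],
    "/services/": ["/services/seo/", "/"],
    "/services/seo/": ["/services/seo/audits/"],
    "/about/": [],
}
for url, depth in click_depths(site).items():
    print(depth, url)
```

Pages that end up with a large depth, or that never appear in the result at all, are candidates for a link from a shallower page or from the HTML sitemap.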
  43. 55. Diagnostic Tools
      - Check server headers
        - wget -S
        - Live header status
        - Charles web proxy
        - Wireshark
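
Server headers can also be checked from a short script. The sketch below sends a HEAD request with Python's standard `http.client`, which does not follow redirects, so a 301 or 302 is reported as-is. The function names and the wording of the summaries are assumptions, and `check_url` needs network access to run.

```python
import http.client
from urllib.parse import urlsplit


def describe_status(status: int, location: str = "") -> str:
    """Summarize a response status the way a redirect audit would read it."""
    if status == 301:
        return f"301 permanent redirect to {location}"
    if status == 302:
        return f"302 temporary redirect to {location}"
    if 200 <= status < 300:
        return f"{status} OK"
    return f"{status} needs attention"


def check_url(url: str, timeout: float = 10.0) -> str:
    """Send a HEAD request without following redirects and describe the result."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        connection = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        connection = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        connection.request("HEAD", path)
        response = connection.getresponse()
        return describe_status(response.status, response.getheader("Location", ""))
    finally:
        connection.close()
```

Calling `check_url` on an old URL that should have moved shows at a glance whether the server answers with the intended 301 or with a 302.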
  44. 58. Diagnostic Tools
      - Web Development
        - Firebug
        - Developer Toolbar
  45. 59. More Reading:
      - This presentation
      - Links and resources
      - Book list
      - My contact information
  46. 60. AudetteMedia
      - Founded in 1997.
      - Managing over $4M in media annually.
      - Led by Internet marketing pioneers.
      - Active speakers at leading search marketing conferences.
      - Featured in international articles and publications on SEO.