SEARCH ENGINE OPTIMIZATION




WHY OPTIMIZE?
-Being on top is vital. Studies show that users rarely go past the first page of search
results, or even scroll down within it.
-Cheap, practically free vs. advertising
        -An important lesson from Scrum’s view of testing applies here too: build
        quality in, don’t inspect it in afterwards!
        -When SEO is an integral part of the development cycle rather than a part of
        end testing / quality assurance, it becomes virtually free -> practically
        infinite return on investment!
-Approximately 35-50% of all traffic on a regular site comes through search engines
(based on quick empirical research).
        -This number is obviously subject to great variation depending on the type of
        site. A knowledge base type of site like MSDN, Stackoverflow.com or
        Wikipedia is bound to get that number up to around 80-90% but a short-lived
        marketing site with a strong traditional ad campaign might go way lower.




WHY OPTIMIZE?
-Web search = navigation
-If your site can't be found from search engines, it doesn't exist.
-Most popular search terms at Google.fi in 2008 (porn-related excluded by Google):
        1. YouTube
        2. Irc
        3. Iltalehti
        4. Iltasanomat
        5. Suomi24
        6. You
        7. Wikipedia
        8. Google
        9. Facebook
        10. Sanakirja




SEARCH = GOOGLE
-Arguably the most advanced and unbiased indexing algorithm at the moment.
-Benchmark for all search services
-King of the hill by a HUGE margin… but for how long?
-The only search engine that’s morphed into a verb in our everyday language
        -...of course you can also try to google stuff on Bing or Yahoo...




SEARCH = GOOGLE
http://www.codinghorror.com/blog/archives/001224.html




PAGERANK
-In a nutshell: the number and quality of links pointing to your content
-A page passes its PageRank on through the links in its content, divided by the number
of links on the page.
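
The inheritance rule can be sketched as a small iterative computation. This is a simplified illustration only, not Google's actual algorithm: the damping factor of 0.85, the fixed iteration count and the three-page link graph are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank over a dict mapping page -> list of pages it links to.

    Assumes every link target is also a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if not outgoing:
                # A page with no links spreads its rank evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
            else:
                # Rank is inherited via links, divided by the number of links.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page site: "about" and "blog" both link back to "home".
example_links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home"],
}
ranks = pagerank(example_links)
```

In this toy graph "home" ends up with the highest rank because two pages link to it, while "about" and "blog" each inherit half of what "home" passes on.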




PAGERANK
-The "reputation" of a page
       -Likewise, we generally trust people we know to be reputable, and people whom
       reputable people refer us to.




2 KINDS OF SEO

WHITE HAT SEO
-The strategies and techniques discussed here
-Leads to SUCCESS and HAPPINESS!




2 KINDS OF SEO

BLACK HAT SEO
-Link spamming (blog / forum comments etc)
-Link farming
-Hidden texts
-Hacked pages
-Will always lead to severe PageRank penalties and bans from the index when caught




2 KINDS OF SEO

BLACK HAT SEO
Preventing on your own site:
-rel=nofollow
       -Tells bots which links you don’t want to vouch for.
       -Can be used for site-internal “PageRank sculpting”, but this has generally been
       downplayed as having little or no effect.
       -Should also be used for links that bots have no use for, e.g. “add to cart”,
       “login”, “register” etc.
-Moderation of content
       -Delete all comment / forum spam a.s.a.p.!
-Spambusting: CAPTCHAs, botblockers etc.
-Keep platforms / software updated!!!
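
As a minimal sketch of the rel=nofollow point, the hypothetical helper below renders a link submitted in a comment or forum post so that the site does not vouch for it. The function name and the example URL are assumptions for illustration; a real platform would also validate the URL scheme, which is left out here.

```python
from html import escape


def render_user_link(url, text):
    """Render a user-submitted link that the site does not vouch for."""
    # Escape both values so user input cannot break out of the attribute or tag.
    safe_url = escape(url, quote=True)
    safe_text = escape(text)
    return f'<a href="{safe_url}" rel="nofollow">{safe_text}</a>'


comment_link = render_user_link("http://example.com/?a=1&b=2", "my site")
```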




SEO BEST PRACTICES
-Rule #1: understand how the web works, from all points of view!
       -Protocols, architecture, usability, readability, content, HTML, CSS, JS, server-
       side... EVERYTHING is a part of SEO!
-HTTP status codes, learn them and their meaning
       -Especially the 3xx series, the most common source of mistakes (along with
       soft 404s).
       -Remember that your server is not only telling the user what is going on: your
       audience also includes bots.
       -http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
-Sitemaps
       -http://www.sitemaps.org/protocol.php
       -http://en.wikipedia.org/wiki/Sitemaps
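
For the sitemap item, a minimal sketch of generating a sitemap file with Python's standard library might look like the following. The example.com URLs are placeholders, and only the required loc element of the sitemaps.org protocol is written; optional fields such as lastmod are left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string for an iterable of absolute URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Placeholder URLs for illustration.
sitemap_xml = build_sitemap([
    "http://www.example.com/",
    "http://www.example.com/products/",
])
```

The resulting string can be saved as sitemap.xml and submitted through the search engines' webmaster tools.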




SEO BEST PRACTICES
-Semantic and valid HTML
       -Meaningful tags instead of generic ones. Use tags like h1-h6, strong, em,
       blockquote etc. that describe the content, and meaningful class names / ids.
       -No presentational stuff in HTML! Leave that to CSS and JS!
-Making sure content is crawlable
       -Make sure your site links to its own content!
-Progressive enhancement
       -http://www.alistapart.com/articles/understandingprogressiveenhancement/
       -http://en.wikipedia.org/wiki/Progressive_enhancement
-META!
       -Page title – this one’s really important!
       -META-tags (esp. description, but keywords is also good to have)
       -alt-attributes for images
       -title-attributes for links etc.
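
The META checklist lends itself to an automated check. The sketch below is one possible approach using Python's built-in HTML parser; the class name, the wording of the reported problems and the sample page are assumptions for illustration. It flags a missing page title, a missing meta description, and images whose alt attribute is missing or empty.

```python
from html.parser import HTMLParser


class MetaAudit(HTMLParser):
    """Collect the title, meta description and images lacking alt text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self.images_missing_alt = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content")
        elif tag == "img" and not attrs.get("alt"):
            # Flags both a missing and an empty alt attribute.
            self.images_missing_alt.append(attrs.get("src"))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit(html):
    """Return a list of human-readable metadata problems found in the page."""
    parser = MetaAudit()
    parser.feed(html)
    parser.close()
    problems = []
    if not parser.title.strip():
        problems.append("missing <title>")
    if not parser.description:
        problems.append("missing meta description")
    for src in parser.images_missing_alt:
        problems.append(f"image without alt text: {src}")
    return problems


# Hypothetical page with a title but no description and no alt text.
sample_page = """<html><head><title>WRT widgets</title></head>
<body><h1>WRT widgets</h1><img src="widget.png"></body></html>"""
problems = audit(sample_page)
```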




SEO BEST PRACTICES
-One content, one canonical URL
       -rel="canonical"
       -More than one URL for the same content will generally result in dividing
       PageRank between the URLs
-Clean URLs with unnecessary stuff stripped out - no session IDs etc!!!
-Do not let URLs die, redirect gracefully. This is especially important in site
redesigns!!!
       -Either give a clear 404 (with an error page helpful for the user) for truly
       deleted content, or spend a few days making sure all old URLs 301 to a new
       equivalent page with the same content!
       -Losing existing PageRank due to a tech / layout revamp is just plain stupid and
       will hurt your ranking for a good while.
-Robots.txt
       -Tell bots what they should not do.
       -http://en.wikipedia.org/wiki/Robots_Exclusion_Standard
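
The URL rules on this slide can be sketched in a few lines. Everything named here is hypothetical: the session parameter names, the old-to-new path map and the set of live paths stand in for whatever your own site uses. The first function reduces URL variants to one canonical form; the second decides between 200, a permanent 301 and a real 404.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that carry session state rather than content.
NOISE_PARAMS = {"sid", "sessionid", "phpsessid"}

# Hypothetical map from pre-redesign paths to their new equivalents.
MOVED = {"/products.php": "/products/"}

# Hypothetical set of paths the current site serves.
LIVE = {"/", "/products/"}


def canonical_url(url):
    """Lower-case scheme and host, drop noise parameters and the fragment."""
    parts = urlsplit(url)
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key.lower() not in NOISE_PARAMS]
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path or "/", urlencode(query), ""))


def respond(path):
    """Return (status, location) for a requested path."""
    if path in LIVE:
        return 200, None
    if path in MOVED:
        return 301, MOVED[path]  # permanent redirect to the new equivalent
    return 404, None             # truly deleted content gets a real 404


clean = canonical_url("HTTP://Example.com/products/?sid=abc123&page=2#top")
```

The canonical form is also the value you would put in the rel="canonical" link of every variant of the page.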




COMMON SEO PROBLEMS
-Heavy and poorly planned use of images / Flash for content
       -Flash is crawled to some extent nowadays, but is still far from the level of
       crawlability of plain ol’ HTML.
-Non-crawlable HTML
       -Severely broken HTML with poor machine-readability.
-Content hidden behind JavaScript interaction
       -Too much AJAX interaction with no HTML fallback
       -Select-box dropdown menus
       -Bots don’t do JS. You can easily check this by browsing the site with JS off.
       If you can’t access the content that way, neither can bots.
-Poor or non-relevant META
       -All unique content should have unique and meaningful metadata: title,
       description, keywords.
-Worst of all: poor content!
       -Nothing will save you if your content is just plain bad.
-Bad HTTP
       -302s (temporary redirect) when you should use 301s (permanent redirect).
       -200s when you should give an error like 404. Also called “soft 404s”.
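
The "browse with JS off" check can be roughly approximated in code. The sketch below assumes a bot that reads only the HTML as delivered: it collects the text present in the markup and skips anything inside script and style elements. The class name and the sample page are made up for the example.

```python
from html.parser import HTMLParser


class TextWithoutScripts(HTMLParser):
    """Collect text present in the HTML itself, skipping script and style."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits outside script/style elements.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def crawlable_text(html):
    """Return the text a non-JS client would find in the delivered HTML."""
    parser = TextWithoutScripts()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page whose price list only appears after a script runs.
page = """<body><h1>Widgets</h1>
<div id="prices"></div>
<script>document.getElementById("prices").innerHTML = "Price list";</script>
</body>"""
text = crawlable_text(page)
```

Here the heading survives but the price list does not, which is the same gap you would see when browsing the page with JS off.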




COMMON SEO MISCONCEPTIONS
-Valid HTML not always a must
        -It's ok to bend rules if you know how and why.
-Div layout is not a silver bullet and tables are not always bad.
        -Use tables when applicable. A hack-free table layout is far better than a
        glued-together div layout that suffers from “divitis” (= way too many div
        tags) or, worse yet, is held together by JavaScript.
        -Also, it’s still ALWAYS ok to use table tags to present table data. That’s what
        they’re for!
        -You should still steer clear of nested table structures.
        -The most important thing is that content that belongs together is kept
        together and not broken apart by markup.




COMMON SEO MISCONCEPTIONS
-There are no silver bullets
       -SEO is always the sum of its parts – or a chain only as strong as its weakest link.
-There are no golden tickets or insider favoritism
       -Advertising on Google won’t rank you higher in unsponsored search results




UNDERSTANDING NON-TECHNICAL ISSUES AS WELL AS TECH STUFF
-Strong content is vital
        -By far the most important part of SEO! Why would a search engine give a
        hoot about your content if it sucks?!
-Does your page actually have the terms you want to be found with in a meaningful
way?
        -If you want to be found with “WRT widgets”, does your page actually have
        those words? Does it have them in a prominent position that tells the bots
        that they’re actually relevant to your content?
-What do people actually search for and why? Is your site a relevant answer to that
search?
        -Put yourself in the position of the kind of user you wish to attract. What will
        they be searching for and is that content found on your site?
-It's sometimes better to "own" a specific query than do ok with a more general one
        -You’ll never be found at the top of Google with a generic term like
        “professional” or “super cool dude”. Opt to “own” something far more specific
        instead.
        -Good ideas on this (in Finnish):
                 -http://gurumarkkinointi.fi/2009/08/26/kuinka-guru-suojautuu-kilpailulta-osa-1/
                 -http://gurumarkkinointi.fi/2009/09/02/kuinka-guru-suojautuu-kilpailulta-osa-2/
                 -http://gurumarkkinointi.fi/2009/09/08/kuinka-guru-suojautuu-kilpailulta-osa-3/



GOOGLE TOOLS - use them!
-Webmaster tools - https://www.google.com/webmasters/tools
       -Sitemaps
       -Diagnostics
       -Crawl stats
       -Top search queries
       -Keywords
       -Etc.
-Google Analytics - http://www.google.com/analytics/




LINKS
(also see links from all previous slides)
-Matt Cutts - http://www.mattcutts.com/blog/
        -The head of Google’s webspam team, a Googler since 2000 and a _really_
        interesting blogger who more often than not provides helpful insight into
        SEO
-Google Webmaster Central blog - http://googlewebmastercentral.blogspot.com/
        -A really helpful blog straight from the horse’s mouth.
        -Up-to-date tips, news on new features and great article series.
-http://www.seomoz.org
        -Great collection of SEO-related articles and resources.




THANK YOU!



